Entrepreneurship Introduction – Definitions, Traits and Characteristics

Entrepreneurship Introduction – Entrepreneurship is a major driver of the modern global economy and the primary source of job creation. In its broadest sense, entrepreneurship is any enterprise or effort that adds value to people's lives.

Entrepreneurship Introduction

Entrepreneurship is popularly understood as the setting up of new business ventures. Almost everything around us has its roots in entrepreneurship in some way. It is a core component of economic growth, and it serves as a critical spur for wealth creation, the introduction of new goods and services, and the opening of new markets to innovation.

As a body of knowledge, entrepreneurship is a philosophy in which an individual acts as an imaginative and innovative agent with an aspiration for ownership and the right to make proprietary decisions. As a practice, entrepreneurship is the process of doing something new, creative, or innovative to create wealth for the individual and value for society.

An entrepreneurial approach applies to business management in general, including the creation of new ventures, managing one's own business, family businesses, government and public institutions, charitable and non-profit organizations, and professional teams.

Definitions of Entrepreneurship

“Entrepreneurship is the process of bearing the risk of buying at certain prices and selling at uncertain prices.” – Paul Di-Masi

“An entrepreneur is an individual who bears the risk of operating a business in the face of uncertainty about future conditions.” – Encyclopedia Britannica

“An entrepreneur is one who innovates and introduces something new in the economy.” – Joseph A. Schumpeter

“Entrepreneurial activity involves identifying opportunities within the economic system.” – Penrose

“An entrepreneur is one who is endowed with more than average capacities in the task of organizing and coordinating the various factors of production.” – Francis A. Walker

“The entrepreneur recognizes and acts upon market opportunities. The entrepreneur is essentially an arbitrageur.” – Israel Kirzner

“Innovation is the specific tool of entrepreneurs.” – Peter F. Drucker

Traits of an Entrepreneur

  • He develops and owns his own enterprise.
  • He is a moderate risk taker and works under uncertainty to achieve his goals.
  • He is innovative.
  • He pursues unconventional, deviant paths.
  • He reflects a strong urge to be independent.
  • He persistently tries to do things better.
  • He is dissatisfied with routine activities.
  • He is prepared to withstand a hard life.
  • He is determined but patient.
  • He exhibits a sense of leadership.
  • He takes personal responsibility.
  • He is oriented towards the future.
  • He converts a situation into an opportunity.
  • He tends to persist in the face of difficulty.
  • He also exhibits a sense of competitiveness.

Characteristics of an Entrepreneur

Mental Ability

An entrepreneur must have creative thinking and must be able to analyze problems and situations. He should be able to anticipate changes.

Business Secrecy

He should guard his business secrets against his competitors.

Clear Objectives

He must have clear objectives as to the exact nature of the goods to be produced.

Human Relations

He must maintain good relations with his customers, employees, and others to keep the relationships on the right footing. He should have emotional stability, good personal relations, tact, and consideration.

Communication Ability

He should have communication skills, which means that both the sender and the receiver understand each other's message.

Marketing Environment – Types, Analysis, Influence, Internal and External

Study and analysis of the marketing environment are key factors in the success of any organization; companies that have ignored them have gone through difficult times in the past. In this article, we will look at the types of marketing environment, the reasons to study it, and its influences.

Types of Marketing Environment

Companies interact with two kinds of environment: the microenvironment and the macroenvironment.

The microenvironment can be classified into internal environment and external environment.

The internal environment consists of the firm's management structure, i.e., organizational strategies, objectives, and departments within the company, and internal characteristics that can affect its ability to serve its customers.

The external environment consists of suppliers, marketing intermediaries, customers, competitors, and publics. Publics may include local interest groups concerned about a marketer's impact on the environment or on local employment.

The macro environment consists of broader forces that affect demand for a company's goods; these include demographics, economics, nature, technology, politics, and culture.

Why do we analyze the marketing environment?

It is essential to take an outside-inside view of your business to be a successful organization. You need to recognize that the marketing environment is always presenting new opportunities and threats. Successful companies also realize the importance of continuously monitoring and adapting to the environment.

A company that has consistently reinvented one of its brands to keep up with the changing marketing environment is Mattel with its Barbie doll. Many businesses fail to see change as an opportunity. They ignore or resist changes until it is too late, and their strategies, structures, systems, and organizational culture grow increasingly obsolete and dysfunctional.

Corporations as mighty as Nokia, General Motors, IBM, and Sears have gone through difficult times because they ignored macro-environmental changes.

Hence it can be concluded that successful companies recognize and respond efficiently to environmental needs and trends.

Reasons for studying the Marketing Environment

It is vital to study the internal and external environment of the market to assess the strengths, weaknesses, opportunities, and threats presented by the marketing environment.

Why consider internal environment?

The internal environment is the environment prevailing within the organization. It covers the organization's:

  1. Business resources such as brands, product features and product lines, competitive advantages, and core competencies
  2. Infrastructure such as men, machines, methods, materials, and money
  3. Relationships with business stakeholders

Studying the internal environment helps us realize organizational capabilities concerning the performance of different business functions. It leads to an analysis of strengths and weaknesses within the organization.

Strong areas may help a firm explore long-term advantages in the market, while analysis of weak areas or deficiencies may help it realize its marketing objectives accordingly.

Why study external environment?

External business conditions comprise favorable and unfavorable events. Favorable events, if adequately explored, can lead to a successful business.

Hostile circumstances, if not efficiently managed, may pose tough times for organizations. Scanning the external environment may lead to the discovery of opportunities and threats.

Opportunities help an organization take advantage of the external environment, while research into risks supports an organization in constructing and implementing a plan to safeguard its business interests.

The primary purpose of environmental scanning is to recognize new marketing opportunities. A marketing opportunity is an area of buyer need or potential interest in which a company can perform profitably. Opportunities come in many forms, and marketers must have the ability to spot them.

Influence of Marketing Environment

Marketing environment may affect various factors including wants, primary and secondary research, consumer behavior, demand for goods, SWOT, types of products, etc.

  1. Wants are influenced by levels of economic prosperity and by social and cultural factors.
  2. Macro environment issues will be critical areas for firms to do primary and secondary research.
  3. Technological, economic, social and cultural factors may influence consumer behavior.
  4. Economic, political, technological factors may affect demand for goods.
  5. Types of products developed by the organization have to take the micro and macro environment into account.
  6. Economic factors may determine which markets are more profitable than others. Low levels of commercial success are not always bad for growth; for example, in developing countries the degree of local competition may be low.
  7. The changing macro environment may influence strengths/weaknesses, opportunities, threats faced by an organization.

Pig Operators – Pig Input, Output Operators, Pig Relational Operators

Input and output operators, relational operators, and the bincond operator are some of the Pig operators. Let us understand each of these, one by one.

Pig Input Output Operators

Pig LOAD Operator (Input)

The first task for any data flow language is to provide the input. The load operator in Pig is used for the input operation; it reads data from HDFS or the local file system.

By default, it looks for a tab-delimited file.

For example, X = load '/data/hdfs/emp'; will look for the "emp" file in the directory "/data/hdfs/". If the directory path is not specified, Pig will look in the home directory on the HDFS file system.

If you are loading data from another storage system, say HBase, then you need to specify the loader function for that storage system.

X = load 'emp' using HBaseStorage();

If we do not specify a loader function, then by default Pig uses "PigStorage", which assumes the file is tab delimited.

If the fields in our file are separated by something other than tabs, then we need to pass the delimiter explicitly as an argument to the load function. Example as below:

X = load 'emp' using PigStorage(',');

Pig Latin also allows you to specify the schema of the data you are loading by using the "as" clause in the load statement.

X = load 'emp' as (ename, eno, sal, dno);

If we load the data without specifying a schema, then the columns are addressed positionally as $0, $1, etc. While specifying the schema, we can also specify the datatype along with the column name. For example:

X = load 'emp' as (ename:chararray, eno:int, sal:float, dno:int);

X = load 'hdfs://localhost:9000/pig_data/emp_data.txt' USING PigStorage(',') as (ename:chararray, eno:int, sal:float, dno:int);

Pig STORE Operator (Output)

Once the data is processed (say into a relation named processed), you will want to write it somewhere. The "store" operator is used for this purpose. By default, Pig stores the processed data into HDFS in tab-delimited format.

store processed into '/data/hdfs/emp';

PigStorage is used as the default store function; otherwise, we can specify one explicitly depending upon the storage.

store emp into 'emp' using HBaseStorage();

We can also specify the field delimiter while writing the data.

store emp into 'emp' using PigStorage(',');

Pig DUMP Operator (on command window)

If you wish to see the data on the screen or command window (grunt prompt), you can use the dump operator.

dump emp;

Pig Relational Operators

Pig FOREACH Operator

Loops through each tuple and generates new tuple(s). Let us suppose we have a file emp.txt kept in an HDFS directory. Sample data of emp.txt as below:

mak,101,5000.0,500.0,10
ronning,102,6000.0,300.0,20
puru,103,6500.0,700.0,10

First, we have to load the data into Pig, say through a relation named "emp_details".

grunt> emp_details = LOAD 'emp' USING PigStorage(',') as (ename:chararray, eno:int, sal:float, bonus:float, dno:int);

Now we need to get the ename, eno, and dno of each employee from the relation emp_details and store them in another relation named employee_foreach.

grunt> employee_foreach = FOREACH emp_details GENERATE ename, eno, dno;

Verify the relation "employee_foreach" using the DUMP operator.

grunt> Dump employee_foreach;

Standard arithmetic operations on integers and floating-point numbers are supported in the foreach relational operator.

grunt> emp_details = LOAD 'emp' USING PigStorage(',') as (ename:chararray, eno:int, sal:float, bonus:float, dno:int);

grunt> emp_total_sal = foreach emp_details GENERATE sal + bonus;

grunt> emp_total_sal1 = foreach emp_details GENERATE $2 + $3;

emp_total_sal and emp_total_sal1 give you the same output. References by position are useful when the schema is unknown or undeclared. Positional references start from 0 and are preceded by the $ symbol.

A range of fields can also be accessed by using the double dot (..) notation. For example:

grunt> emp_details = LOAD 'emp' USING PigStorage(',') as (ename:chararray, eno:int, sal:float, bonus:float, dno:int);

grunt> beginning = FOREACH emp_details GENERATE ..sal;

The output of the above statement will generate the values for the columns ename, eno, sal.

grunt> middle = FOREACH emp_details GENERATE eno..bonus;

The output of the above statement will generate the values for the columns eno, sal, bonus.

grunt> end = FOREACH emp_details GENERATE bonus..;

The output of the above statement will generate the values for the columns bonus, dno.

Bincond or Boolean Test

The binary conditional operator is also referred to as the "bincond" operator. Let us understand it with the help of an example.

5==5 ? 1 : 2 begins with a Boolean test followed by the symbol "?". If the Boolean condition is true, it returns the first value, the one after "?"; otherwise it returns the value after the ":". Here the Boolean condition is true, hence the output will be "1".

5==6 ? 1 : 2 will output "2" in this case.
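In a script, the bincond is usually written inside a FOREACH. A minimal sketch, reusing the emp layout from above (the high_paid column name and the 6000.0 threshold are made up for illustration):

```pig
emp_details = LOAD 'emp' USING PigStorage(',') as (ename:chararray, eno:int, sal:float, bonus:float, dno:int);
-- flag employees whose salary exceeds the (hypothetical) 6000.0 threshold: 1 if true, 0 otherwise
emp_flagged = FOREACH emp_details GENERATE ename, (sal > 6000.0F ? 1 : 0) AS high_paid;
```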

We have to use the projection operator for complex data types. If you reference a key that does not exist in a map, the result is null. For example:

student_details = LOAD 'student' as (sname:chararray, sclass:chararray, rollnum:int, stud:map[]);

avg = FOREACH student_details GENERATE stud#'student_avg';

For maps the projection operator is # (the hash), followed by the name of the key as a string. Here, 'student_avg' is the name of the key and 'stud' is the name of the column/field.

Pig FILTER Operator

The filter operator allows you to select the required tuples based on a predicate clause. Let us consider the same emp file. Our requirement is to select the records with department number (dno) equal to 10.

grunt> filter_data = FILTER emp BY dno == 10;

If you dump the "filter_data" relation, the output on your screen will be as below:

mak,101,5000.0,500.0,10

puru,103,6500.0,700.0,10

We can combine multiple filters using the Boolean operators "and" and "or". Pig can also use regular expressions to match values present in the file. For example, if we want all the records whose ename starts with 'ma', we can use the expression:

grunt> filter_ma = FILTER emp by ename matches 'ma.*';

The filter passes only those records for which the predicate evaluates to 'true'; every record is evaluated as 'true' or 'false'.

It is important to note that if, say, z == null, then the result is null, which is neither true nor false.

Let us suppose we have the values 1, 8, and null. If the filter is x == 8, the return value will be 8. If the filter is x != 8, the return value will be 1.

We can see that null is not returned in either case. Therefore, to work with null values we use the 'is null' or 'is not null' operators.
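Following the note above, a sketch of explicitly filtering out records with a null bonus, so that later arithmetic on bonus does not silently drop them:

```pig
emp_details = LOAD 'emp' USING PigStorage(',') as (ename:chararray, eno:int, sal:float, bonus:float, dno:int);
-- keep only records where bonus is present; null records would fail both == and != tests
with_bonus = FILTER emp_details BY bonus IS NOT NULL;
```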

Pig GROUP Operator

The Pig group operator works fundamentally differently from the GROUP BY we use in SQL.

It collects records with the same key together into one bag. In SQL, the group by clause creates groups of values that are fed into one or more aggregate functions, whereas in Pig Latin it just groups all the matching records together and puts them into one bag.

Hence, in Pig Latin there is no direct connection between group and the aggregate functions.

grunt> emp_details = LOAD 'emp' USING PigStorage(',') as (ename:chararray, eno:int, sal:float, bonus:float, dno:int);

grunt> grpd = GROUP emp_details BY dno;

grunt> cnt = FOREACH grpd GENERATE group, COUNT(emp_details);

Pig ORDER BY Operator

The Pig ORDER BY operator is used to display the result of a relation in sorted order based on one or more fields. For example:

grunt> Order_by_ename = ORDER emp_details BY ename ASC;
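Sorting on more than one field, each with its own direction, follows the same pattern. A sketch, assuming the emp_details relation from the earlier examples: department ascending, then salary descending within each department:

```pig
ordered = ORDER emp_details BY dno ASC, sal DESC;
```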

Pig DISTINCT Operator

This is used to remove duplicate records from the file. It doesn't work on individual fields; rather, it works on entire records.

grunt> unique_records = distinct emp_details;

Pig LIMIT Operator

Limit allows you to restrict the number of records displayed from a file.

grunt> emp_details = LOAD 'emp';

grunt> first50 = LIMIT emp_details 50;

Pig SAMPLE Operator

The sample operator allows you to take a sample of data from your whole data set, i.e., it returns a percentage of the rows. It takes a value between 0 and 1; if it is 0.2, it indicates 20% of the data.

grunt> emp_details = LOAD 'emp';

grunt> sample20 = SAMPLE emp_details 0.2;

Pig PARALLEL

The Pig PARALLEL clause is used for parallel data processing. It sets the number of reducers at the operator level.

We can include the PARALLEL clause with any operator that triggers a reduce phase, such as DISTINCT, JOIN, GROUP, COGROUP, ORDER BY, etc.

For example: SET default_parallel 10;

This means that all MapReduce jobs the script launches will have 10 parallel reducers running at a time.

It is important to note that PARALLEL only sets reducer parallelism; mapper parallelism is controlled by the MapReduce engine itself.
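A sketch of the per-operator form, which overrides the script-level default for just that one operator (the reducer count of 5 is an arbitrary choice):

```pig
-- run the reduce phase of this GROUP with 5 reducers
grpd = GROUP emp_details BY dno PARALLEL 5;
```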

Pig FLATTEN Operator

Pig FLATTEN removes a level of nesting from tuples as well as bags. For example, suppose we have a tuple of the form (1, (2, 3)).

GENERATE $0, flatten($1) will transform the tuple into (1, 2, 3).

When we un-nest a bag using the flatten operator, it creates new tuples. For example, suppose we have the bag (1, {(2,3),(4,5)}).

GENERATE $0, flatten($1) then creates the tuples (1,2,3) and (1,4,5).
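A common use of flatten is to un-nest the bag produced by GROUP. A sketch using the emp_details relation from the earlier sections:

```pig
grpd = GROUP emp_details BY dno;
-- one output tuple per (dno, ename) pair instead of one bag of employees per dno
names_by_dept = FOREACH grpd GENERATE group AS dno, FLATTEN(emp_details.ename);
```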

Pig COGROUP Operator

The Pig COGROUP operator works the same as the GROUP operator. The only difference is that GROUP works with a single relation, while COGROUP is used when we have more than one relation.

Let us suppose we have below two relations with their data sets:

student_details.txt

101,Kum May,29,9010101010,Bangalore
102,Abh Nig,24,9020202020,Delhi
103,Sum Nig,24,9030303030,Delhi

employee_details.txt

101,Nancy,22,London
102,Martin,24,Newyork
103,Romi,23,Tokyo

Now, let us try to group the student_details.txt and employee_details.txt records.

grunt> cogroup_final = COGROUP employee_details by age, student_details by age; 

Output as below:

(22, {(101, Nancy, 22, London)}, {})

(23, {(103, Romi, 23, Tokyo)}, {})

(24, {(102, Martin, 24, Newyork)}, {(102, Abh Nig, 24, 9020202020, Delhi), (103, Sum Nig, 24, 9030303030, Delhi)})

(29, {}, {(101, Kum May, 29, 9010101010, Bangalore)})

Pig SPLIT Operator

The Pig SPLIT operator is used to split a single relation into more than one relation depending upon the conditions you provide.

Let us suppose we have emp_details as one relation. We have to split the relation based on department number (dno). Sample data of emp_details as below:

mak,101,5000.0,500.0,10

ronning,102,6000.0,300.0,20

puru,103,6500.0,700.0,10

jetha,103,6500.0,700.0,30

grunt> SPLIT emp_details INTO emp_details1 IF dno == 10, emp_details2 IF (dno == 20 OR dno == 30);

grunt> DUMP emp_details1;

mak,101,5000.0,500.0,10

puru,103,6500.0,700.0,10

grunt> DUMP emp_details2;

ronning,102,6000.0,300.0,20

jetha,103,6500.0,700.0,30

Pig Latin Introduction – Examples, Pig Data Types | RCV Academy

In the following post, we will learn about Pig Latin and Pig Data types in detail.

Pig Latin Overview

Pig Latin provides a platform to non-Java programmers where each processing step results in a new data set, or relation.

For example, X = load 'emp'; Here "X" is the name of the relation, or new data set, which is fed by loading the data set "emp". "X", the name of the relation, is not a variable, although it seems to act like one.

Once the assignment is made to a given relation, say "X", it is permanent. We can reuse the relation name in other steps as well, but it is not advisable to do so, for the sake of script readability.

For Example:

X = load 'emp';
X = filter X by sal > 10000.0;
X = foreach X generate Ename;

Here, at each step, "X" is not reassigned; rather, a new data set is created at each step.

Pig Latin also has a concept of fields or columns. In the above example, "sal" and "Ename" are termed fields or columns.

It is also important to know that keywords in Apache Pig Latin are not case sensitive.

For example, LOAD is equivalent to load. But relation and column names are case sensitive: X = load 'emp'; is not equivalent to x = load 'emp';

For multi-line comments in Apache Pig scripts we use "/* … */", and for single-line comments we use "--".
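Both comment styles in one small fragment:

```pig
/* load the employee data set;
   this is a multi-line comment */
X = load 'emp'; -- single-line comment: uses the default PigStorage loader
```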

Pig Data Types

Pig Scalar Data Types

  • int (signed 32-bit integer)
  • long (signed 64-bit integer)
  • float (32-bit floating point)
  • double (64-bit floating point)
  • chararray (character array (string) in UTF-8)
  • bytearray (binary object)

Pig Complex Data Types

Map

A map is a collection of key-value pairs.

Key-value pairs are separated by the pound sign #. The key must be of chararray datatype and should be unique, while the value can be of any datatype.

For example:

[1#Honda, 2#Toyota, 3#Suzuki], [name#Mak, phone#99845, age#29]

Tuple

A tuple is similar to a row in SQL, with the fields resembling SQL columns. In other words, we can say that tuples are an ordered set of fields formed by grouping scalar data types. Two consecutive tuples need not contain the same number of fields.

For example:

(mak, 29, 4000.0)

Bag

A bag is formed by the collection of tuples. A bag can have duplicate tuples.

If Pig tries to access a field that does not exist, a null value is substituted.

For Example:

{(a), (b)}, {(c), (d)}, {(mak, 29, 4000.0)}, (BigData, {(Hadoop), (Mapreduce), (Pig), (Hive)})

NULLS

A null data element in Apache Pig is just the same as the SQL null data element: the null value in Apache Pig means the value is unknown.
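Nulls also propagate through expressions: arithmetic with a null operand yields null. A sketch, assuming the emp layout used in the operator examples, where bonus may be missing for some records:

```pig
emp_details = LOAD 'emp' USING PigStorage(',') as (ename:chararray, eno:int, sal:float, bonus:float, dno:int);
-- total is null for any record whose bonus is null
totals = FOREACH emp_details GENERATE ename, sal + bonus AS total;
```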

Apache Pig Installation – Execution, Configuration and Utility Commands

Apache Pig installation can be done on a local machine or on a Hadoop cluster. To install Apache Pig, download the package from the Apache Pig releases page.

You can also download the pig package from Cloudera or Apache’s Maven repository.

Pig does not need a Hadoop cluster for installation; it runs on the machine from which we launch Hadoop jobs. It can also be installed on your local desktop or laptop for prototyping purposes, in which case Apache Pig runs in local mode. If your desktop or laptop can access a Hadoop cluster, then you can install the Pig package there as well.

The Pig package is written in Java and hence is portable across operating systems, but the pig launch script is a bash script, so it requires a UNIX/Linux operating system.

Once you have downloaded Pig, place it in a directory of your choice and untar it using the command below:

tar -xvf <pig_downloaded_tar_file>

Apache Pig Execution

You can execute Pig in the following three ways:

  1. Interactive mode (local)

pig -x local

Runs in a single JVM and all files are on the local file system.

  2. Interactive mode on the Hadoop File System

pig

Runs on the Hadoop cluster; this MapReduce mode is the default.

  3. Script mode

pig -x local myscript.pig
pig myscript.pig

A script is a text file that can be run in local or MapReduce mode.

Command Line and their configurations

Pig provides a wide variety of command-line options. Below are a few of them:

-h or --help

Lists all the available command-line options.

-e or --execute

If you want to execute a single command through Pig, you can use this option, e.g., pig -e fs -ls will list the home directory.

-P or --propertyfile

Specifies a property file that Pig should read.

The table below shows the return codes used by Pig along with their descriptions.

Value  Description
0      Success
1      Retriable failure
2      Failure
3      Partial failure (used with multi-query)
4      Illegal arguments passed to Pig
5      IOException thrown (usually by a UDF)
6      PigException thrown (usually by a Python UDF)
7      ParseException thrown (in case of variable substitution)
8      An unexpected exception

Grunt

The interactive shell of Apache Pig is called Grunt. It provides a shell for users to interact with HDFS using Pig Latin.

Once you enter the command below in a Unix environment where the Pig package is installed:

pig -x local

The output will be:

grunt>

To exit the grunt shell, type 'quit' or press Ctrl-D.

HDFS commands can be accessed in the grunt shell using the keyword 'fs'. The dash (-) is a mandatory part of the command, just as when hadoop fs is used.

grunt> fs -ls

Utility Commands for controlling Pig from grunt shell

kill jobid

You can find a job's ID by looking at Hadoop's JobTracker GUI. The above command can be used to kill a Pig job by its job ID.
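For example (the job id shown here is purely hypothetical; use the one reported by the JobTracker):

```pig
grunt> kill job_201802270001_0012
```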

exec

The exec command runs a Pig script in batch mode, with no interaction between the script and the Grunt shell.

Example as below:

grunt> exec script.pig

grunt> exec -param p1=myparam1 -param p2=myparam2 script.pig

run

Issuing a run command in the grunt shell has basically the same effect as typing the statements in manually.

The run and exec commands are useful for debugging because you can modify a Pig script in an editor and then rerun it in the Grunt shell without leaving the shell.