How to Convert String Variables With Non-Numeric Values to Numeric Variables in Stata

We can convert string variables with non-numeric values to numeric variables in Stata using the encode or egen commands.

September 7, 2012 Stata

Many survey questionnaires use a Likert or Likert-like scale, e.g.:

Strongly Agree
Agree
Neutral
Disagree
Strongly Disagree

Always
Usually
About Half the Time
Seldom
Never

Below is another example of non-numeric values in a variable: [image not available]

When analyzing data, it is often desirable to have numeric values (e.g., 0, 1, 2, 3, 4 or 1, 2, 3, 4, 5) instead of non-numeric ones. Stata recognizes these non-numeric values as “string” values, and their variables are called “string variables.”

In Stata, there are a few ways of converting string variables (with non-numeric values) to numeric variables (with numeric values). The most common way is probably the encode command, i.e.:

. encode oldvar, generate(newvar)

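where oldvar is the name of the old variable and newvar is the name of the new variable. The encode command assigns the integers 1, 2, 3, and so on to the distinct string values in alphabetical order, and it attaches the original strings to the new numeric variable as value labels.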
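Because encode numbers the strings alphabetically by default, a Likert scale would be coded Agree = 1, Disagree = 2, Neutral = 3, and so on, which is rarely the order we want. One possible workaround, sketched here, is to define the value label ourselves and pass it to encode through its label() option. The variable names opinion and opinion_num, the label name agree5, and the example data are hypothetical and used only for illustration.

```stata
* Hypothetical example data: one string variable named opinion
clear
input str17 opinion
"Strongly Agree"
"Agree"
"Neutral"
"Disagree"
"Strongly Disagree"
end

* Define the coding we want before calling encode
* (the label name agree5 is an assumption made for this sketch)
label define agree5 1 "Strongly Disagree" 2 "Disagree" 3 "Neutral" 4 "Agree" 5 "Strongly Agree"

* encode uses the mapping in an existing value label named in label()
encode opinion, generate(opinion_num) label(agree5)

* Show the underlying numeric codes next to the original strings
list opinion opinion_num, nolabel
```

Any string value that is missing from the predefined label is still encoded, so it is worth checking the result with label list agree5 afterwards.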

A similar result can be obtained with the egen command, i.e.:

. egen newvar = group(oldvar)

The new variable takes the values 1, 2, 3, and so on for the distinct values of oldvar in sorted order, but by default it has no value labels.
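As a rough illustration of the difference, the sketch below runs egen group() twice on the same hypothetical data: once as shown above, and once with the label option of group(), which attaches the original strings as value labels. The variable names freq, freq_grp, and freq_lbl are assumptions made for this example.

```stata
* Hypothetical example data: one string variable named freq
clear
input str19 freq
"Always"
"Usually"
"About Half the Time"
"Seldom"
"Never"
end

* Plain group(): integer codes in sorted order, no value labels
egen freq_grp = group(freq)

* group() with the label option: same codes, labelled with the strings
egen freq_lbl = group(freq), label

* Compare the original strings with the two new variables
list freq freq_grp freq_lbl
```

As with encode, the codes follow the sorted order of the strings rather than the order of the scale, so a recode may still be needed before analysis.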

3 comments

shantha
March 17, 2014
Dear Dr. Andy,
When I was searching for how to convert string variables into numeric variables in Stata, I found your document. It was really helpful for me. Thank you so much for sharing your knowledge with others.
Warm Regards,
Shantha
- Dr Andy Teh
  March 18, 2014
  @Shantha – You’re welcome. I’m glad I could help! 🙂
Anamika
December 8, 2015
Respected sir,
I have a non-numeric code in a variable in Stata and I want to convert that non-numeric code (under that particular variable) into a numeric value. Please tell me, how can I solve my problem?
