Many survey questionnaires use a Likert or Likert-like scale, e.g.:
- Strongly Agree
- Agree
- Neutral
- Disagree
- Strongly Disagree
or
- Always
- Usually
- About Half the Time
- Seldom
- Never
Below is another example of non-numeric values in a variable:
- A
- B
- C
- D
- E
When analyzing data, it is often desirable to have numeric values (e.g. 0, 1, 2, 3, 4 or 1, 2, 3, 4, 5) instead of non-numeric ones. Stata recognizes these non-numeric values as “string” values and their variables are called “string variables.”
In Stata, there are a few ways of converting string variables (with non-numeric values) to numeric variables (with numeric values). Probably the most common way is to use the encode command, i.e.:
. encode oldvar, generate(newvar)
where oldvar is the name of the existing string variable and newvar is the name of the new numeric variable. The encode command assigns the codes 1, 2, 3, and so on in alphabetical order of the string values, and it attaches the original strings to the new variable as value labels.
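Because encode numbers the categories alphabetically, an ordered scale such as Strongly Agree to Strongly Disagree will not come out in its natural order by default. The sketch below shows one way around this: define a value label with the intended order first, then pass it to encode through the label() option. The variable names (response, response_alpha, response_num), the label name (agree5), and the toy data are all made up for illustration.

```stata
* Hypothetical toy data: a string variable called response
clear
input str20 response
"Strongly Agree"
"Agree"
"Neutral"
"Disagree"
"Strongly Disagree"
end

* Default behaviour: codes follow the alphabetical order of the strings
encode response, generate(response_alpha)

* Define the intended ordering first (agree5 is an assumed label name) ...
label define agree5 1 "Strongly Disagree" 2 "Disagree" 3 "Neutral" 4 "Agree" 5 "Strongly Agree"

* ... then ask encode to reuse that mapping instead of building its own
encode response, generate(response_num) label(agree5)

* nolabel shows the underlying numeric codes rather than the labels
list response response_alpha response_num, nolabel
```

Comparing the two new columns in the listing makes the difference between the alphabetical coding and the user-defined coding visible.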
Another way of doing the same thing is to use the egen command with its group() function, i.e.:
. egen newvar = group(oldvar)
The new variable will have numeric values without value labels.
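As a rough illustration of the egen approach, the sketch below builds a small made-up dataset with a string variable called grade (a hypothetical name) and converts it with group(). It also shows the label option of group(), which, as far as I know, attaches the original strings as value labels if you do want them.

```stata
* Hypothetical toy data: a string variable called grade
clear
input str1 grade
"C"
"A"
"E"
"B"
"D"
"A"
end

* Plain numeric codes, numbered in sorted order of the strings, no value labels
egen grade_num = group(grade)

* Same codes, but with the original strings attached as value labels
egen grade_lab = group(grade), label

* nolabel shows the underlying numbers for both new variables
list grade grade_num grade_lab, nolabel
```

Cross-tabulating the old and new variables (for example, tabulate grade grade_num) is a quick way to check that each string value maps to exactly one numeric code.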
Dear Dr. Andy,
While searching for how to convert string variables into numeric variables in Stata, I found your document. It was really helpful to me. Thank you so much for sharing your knowledge with others.
Warm Regards,
Shantha
@Shantha – You’re welcome. I’m glad I could help! 🙂
Respected sir,
I have a non-numeric code in a variable in Stata and I want to convert that non-numeric code (under that particular variable) into a numeric value. Please tell me, how can I solve my problem?