
# How to Convert String Variables With Non-Numeric Values to Numeric Variables in Stata

String variables with non-numeric values can be converted to numeric variables in Stata using the encode or egen commands.

Many survey questionnaires use a Likert or Likert-like scale, e.g.:

1. Strongly Agree
2. Agree
3. Neutral
4. Disagree
5. Strongly Disagree

or

1. Always
2. Usually
4. Seldom
5. Never

Below is another example of non-numeric values in a variable:

1. A
2. B
3. C
4. D
5. E

When analyzing data, it is often desirable to have numeric values (e.g. 0, 1, 2, 3, 4 or 1, 2, 3, 4, 5) instead of non-numeric ones. Stata stores these non-numeric values as “string” values, and the variables that hold them are called “string variables.”

In Stata, there are a few ways of converting string variables (with non-numeric values) to numeric variables (with numeric values). The most common way is probably the `encode` command, i.e.:

``. encode oldvar, generate(newvar)``

where `oldvar` is the name of the existing string variable and `newvar` is the name of the new numeric variable. `encode` assigns the integer codes 1, 2, 3, … in alphabetical order of the string values and attaches the original strings to `newvar` as value labels.
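Because `encode` orders codes alphabetically, a Likert item will not automatically keep its questionnaire order. The sketch below uses a small made-up dataset; the label name `agree5` and the variable `newvar2` are illustrative choices, not part of the original example. It shows the default behaviour first, and then one way to fix the order by defining the value label beforehand and passing it through the `label()` option.

```stata
* Hypothetical toy data: a string variable holding Likert responses
clear
input str17 oldvar
"Strongly Agree"
"Agree"
"Neutral"
"Disagree"
"Strongly Disagree"
"Agree"
end

* Default behaviour: codes follow the alphabetical order of the strings
* (Agree = 1, Disagree = 2, Neutral = 3, Strongly Agree = 4, ...)
encode oldvar, generate(newvar)
list oldvar newvar, nolabel

* To keep the questionnaire order, define the value label first
* (agree5 is an assumed name) and pass it to encode.
* noextend makes encode stop with an error if it meets a string
* that is not already in the label.
label define agree5 1 "Strongly Agree" 2 "Agree" 3 "Neutral" 4 "Disagree" 5 "Strongly Disagree"
encode oldvar, generate(newvar2) label(agree5) noextend
list oldvar newvar newvar2, nolabel
```

The strings must match the label text exactly, including capitalization and spacing, for the predefined codes to be used.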

Another way of doing the same thing is by using the `egen` command, i.e.:

``. egen newvar = group(oldvar)``

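The new variable will take the values 1, 2, 3, … in the sorted order of the strings, without value labels by default. Adding the `label` option to `group()` attaches the original strings as value labels.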
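As a minimal sketch of the `egen` approach, the following uses made-up letter codes; the data and the name `newvar2` are assumptions for illustration only. It creates the grouped variable with and without the `label` option so the two results can be compared.

```stata
* Hypothetical toy data: letter codes stored as strings
clear
input str1 oldvar
"C"
"A"
"B"
"A"
"E"
"D"
end

* group() numbers the distinct values 1, 2, 3, ... in sorted order
egen newvar = group(oldvar)

* With the label option, the original strings become value labels
egen newvar2 = group(oldvar), label

* nolabel shows the underlying numeric codes for both variables
list oldvar newvar newvar2, nolabel
describe newvar newvar2
```

By default, `group()` leaves the new variable missing wherever `oldvar` is empty; the `missing` option treats empty strings as a group of their own.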

## 3 replies on “How to Convert String Variables With Non-Numeric Values to Numeric Variables in Stata”

shantha says:

Dear Dr. Andy;
When I was searching for how to convert string variables into numeric variables in Stata, I found your document. It was really helpful for me. Thank you so much for sharing your knowledge with others.

Warm Regards,
Shantha

Andy Teh says:

@Shanta – You’re welcome. I’m glad I could help! 🙂

Anamika says:

Respected sir,
I have a non-numeric code in a variable in Stata and I want to rename that non-numeric code (under that particular variable) into a numeric value. Please tell me, how can I solve my problem?