Data Management & Analysis: Data Transformations
These operations compute functions of your data and create new variables. Data may be imported into the environment or created internally with the random number generators.
Command Structures
- Conditional: If(...), Else, Then...
- Parentheses to any number of levels
- Implied multiplication of grouped variables and expressions
Algebraic Transformations
- Standard operators: + - * / ^ (power)
- Binary variable operators: x > 1 evaluates to the indicator 1(x > 1), which is 1 when the condition holds and 0 otherwise; likewise for <, >=, etc.
- Comparison operators: x ! y = max(x,y), x ~ y = min(x,y)
- Expand categorical variable into a set of dummy variables
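
The algebraic transformations above can be illustrated outside the program. The following is a minimal Python sketch using NumPy, not the program's own command syntax; the function name expand_dummies and the sample arrays are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical sample data, for illustration only.
x = np.array([0.5, 1.0, 2.5, 4.0])
y = np.array([1.5, 0.2, 2.5, 3.0])

# Binary variable operator: 1(x > 1) is 1 where the condition holds, else 0.
indicator = (x > 1).astype(int)

# x ! y = max(x, y) and x ~ y = min(x, y), applied element by element.
larger = np.maximum(x, y)
smaller = np.minimum(x, y)


def expand_dummies(codes):
    """Return the sorted category levels and one 0/1 column per level."""
    codes = np.asarray(codes)
    levels = np.unique(codes)
    # Broadcasting compares every observation against every level.
    return levels, (codes[:, None] == levels).astype(int)


levels, dummies = expand_dummies([2, 1, 3, 2])
```

Each row of dummies contains exactly one 1, in the column of that observation's category.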
Functions
- Log, Exp, Abs, Sqr, Sin, Rsn (arcsin), Cos, Rcs (arccos), Tan
- Gamma, digamma, trigamma, log gamma, beta
- Sign, fix (round to nearest integer), integer part
- Box-Cox transformation and derivatives of Box-Cox
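
As an illustration of the Box-Cox item above, here is a minimal Python sketch of the standard textbook form of the transformation, (x^lambda - 1) / lambda with the limit ln(x) at lambda = 0. The list does not say which derivatives are provided; the sketch assumes the derivative with respect to lambda. The function names box_cox and box_cox_dlam are hypothetical.

```python
import math


def box_cox(x, lam):
    """Box-Cox transformation of a positive value x with parameter lam."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        return math.log(x)  # limiting case as lam -> 0
    return (x ** lam - 1.0) / lam


def box_cox_dlam(x, lam):
    """Derivative of box_cox(x, lam) with respect to lam (an assumption:
    the source does not specify which derivative is meant)."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    log_x = math.log(x)
    if lam == 0:
        return 0.5 * log_x ** 2  # limit of the general expression
    return (lam * x ** lam * log_x - (x ** lam - 1.0)) / lam ** 2
```

For example, box_cox(4.0, 0.5) evaluates (sqrt(4) - 1) / 0.5, which is 2.0.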
Random Number Generation
- Continuous random variables: uniform, normal, lognormal, t, chi-squared, F, exponential, Weibull, Gumbel, gamma, beta, logistic, Cauchy, truncated standard normal
- Discrete random variables: Poisson, discrete uniform, binomial, geometric
- Halton sequences
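
For readers unfamiliar with Halton sequences, the sketch below shows the usual radical-inverse construction in Python. It illustrates the general technique only and is not taken from the program; the function name halton is hypothetical.

```python
def halton(index, base):
    """Return element number `index` (index >= 1) of the Halton sequence
    for the given prime base, a value in the open interval (0, 1)."""
    if index < 1 or base < 2:
        raise ValueError("index must be >= 1 and base must be >= 2")
    result = 0.0
    fraction = 1.0
    while index > 0:
        fraction /= base
        # Reflect the base-`base` digits of index about the radix point.
        result += fraction * (index % base)
        index //= base
    return result


# First few draws in base 2: 1/2, 1/4, 3/4, 1/8
draws = [halton(i, 2) for i in range(1, 5)]
```

Unlike pseudo-random uniform draws, these values fill the unit interval evenly, which is why such sequences are often used in simulation-based estimation.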
Probability Distributions
- Logistic: logit function, density, cdf
- Univariate normal: density, cdf, truncated means and variances, sample selection ‘lambda,’ inverse normal cdf, inverse cdf to pdf
- Bivariate normal: pdf, cdf, partial derivatives
- Multivariate normal: cdf
- Means, deviations, standardized variables
- Panel data, group means and deviations
- Rowwise moments of a set of variables
- Multiple of a set of variables
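
Several of the univariate normal functions listed above have short closed forms. The following Python sketch uses the standard library's statistics.NormalDist and rests on assumptions: the sample selection lambda is taken to be the inverse Mills ratio phi(z) / Phi(z), and "inverse cdf to pdf" is read as the density evaluated at the quantile for a given probability. The program's own conventions may differ, and all function names here are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def selection_lambda(z):
    """Inverse Mills ratio phi(z) / Phi(z) (assumed convention)."""
    return STD_NORMAL.pdf(z) / STD_NORMAL.cdf(z)


def truncated_mean_above(a):
    """E[z | z > a] for a standard normal z."""
    return STD_NORMAL.pdf(a) / (1.0 - STD_NORMAL.cdf(a))


def inverse_cdf_to_pdf(p):
    """Density at the quantile whose cumulative probability is p
    (assumed reading of 'inverse cdf to pdf')."""
    return STD_NORMAL.pdf(STD_NORMAL.inv_cdf(p))
```

By the symmetry of the normal density, truncated_mean_above(a) equals selection_lambda(-a).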
Recode ranges of values or recode specific values to discrete values
Sort ascending or descending (carry other variables and/or labels)
Trends and seasonal dummy variables
Stratification variables and period variables for panels
Leads and lags
Matrix functions: dot products, quadratic forms
Sample statistical transformations
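
The trend, seasonal dummy, and lead and lag items above can be sketched in a few lines of Python. This is an illustration under assumptions, not the program's syntax: observations are taken to be in time order, values shifted in from outside the sample are marked missing with None, and the function names shift and seasonal_dummies are hypothetical.

```python
def shift(values, periods, fill=None):
    """Lag a series when periods > 0, lead it when periods < 0.
    Positions with no source observation receive `fill`."""
    n = len(values)
    shifted = []
    for t in range(n):
        source = t - periods
        shifted.append(values[source] if 0 <= source < n else fill)
    return shifted


def seasonal_dummies(n_obs, n_seasons):
    """One 0/1 column per season, assuming observation 0 is in season 0."""
    return [[1 if t % n_seasons == s else 0 for s in range(n_seasons)]
            for t in range(n_obs)]


series = [10, 20, 30, 40]
lag1 = shift(series, 1)                    # value from the previous period
lead1 = shift(series, -1)                  # value from the next period
trend = list(range(1, len(series) + 1))    # linear time trend 1, 2, 3, ...
```

For panel data, a shift of this kind would normally be applied within each group so that values do not carry over from one group to the next.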