The trick here is to create a random variable, sort the dataset by that random variable, and then assign the observations to the groups. Let’s use the hsb2 dataset as an example by randomly assigning 50 observations to each of four groups.
    use https://stats.idre.ucla.edu/stat/stata/notes/hsb2, clear
    set seed 12345
    generate rannum = uniform()
    sort rannum
    generate grp = .
    replace grp = 0 in 1/50
    replace grp = 1 in 51/100
    replace grp = 2 in 101/150
    replace grp = 3 in 151/200
    tabulate grp

            grp |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |         50       25.00       25.00
              1 |         50       25.00       50.00
              2 |         50       25.00       75.00
              3 |         50       25.00      100.00
    ------------+-----------------------------------
          Total |        200      100.00

    sort id
    clist id grp in 1/20

                id        grp
      1.         1          0
      2.         2          3
      3.         3          2
      4.         4          1
      5.         5          0
      6.         6          3
      7.         7          1
      8.         8          2
      9.         9          0
     10.        10          0
     11.        11          1
     12.        12          0
     13.        13          3
     14.        14          0
     15.        15          3
     16.        16          3
     17.        17          3
     18.        18          1
     19.        19          3
     20.        20          3
Because the assignment is random, the grp value for each id depends on the seed. If you use a different seed, or omit set seed altogether, the pattern will differ from the one shown here; it may also differ across Stata versions.
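The four replace statements can also be collapsed into a single expression based on the observation number _n. The following is a minimal sketch rather than part of the original example: it assumes the data are already sorted by rannum and that the 200 observations divide evenly into groups of 50. The variable name grp_alt is hypothetical, and runiform() is used here as the current name for uniform().

```stata
use https://stats.idre.ucla.edu/stat/stata/notes/hsb2, clear
set seed 12345
generate rannum = runiform()
sort rannum
* observations 1-50 get 0, 51-100 get 1, 101-150 get 2, 151-200 get 3
generate grp_alt = floor((_n - 1)/50)
tabulate grp_alt
```

To use a different group size, change the divisor 50. If the group size does not divide the number of observations evenly, the last group will be smaller than the others.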
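The code can be made even simpler than the above by using the cut() function of egen. With the group(4) option, cut() divides the observations into four groups of roughly equal size based on the value of rannum, so no explicit sort or replace statements are needed.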
    use https://stats.idre.ucla.edu/stat/stata/notes/hsb2, clear
    generate rannum = uniform()
    egen grp2 = cut(rannum), group(4)
    sort id
    list id grp2 in 1/20

                id       grp2
      1.         1          0
      2.         2          3
      3.         3          2
      4.         4          1
      5.         5          0
      6.         6          3
      7.         7          1
      8.         8          2
      9.         9          0
     10.        10          0
     11.        11          1
     12.        12          0
     13.        13          3
     14.        14          0
     15.        15          3
     16.        16          3
     17.        17          3
     18.        18          1
     19.        19          3
     20.        20          3
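One way to check how the two approaches relate is to build both assignments from the same random variable and cross-tabulate them. This is a sketch under stated assumptions, not part of the original example: it sets the seed explicitly so that the run is repeatable, and it uses runiform(), the current name for uniform().

```stata
use https://stats.idre.ucla.edu/stat/stata/notes/hsb2, clear
set seed 12345
generate rannum = runiform()
* manual assignment from the sorted order
sort rannum
generate grp = .
replace grp = 0 in 1/50
replace grp = 1 in 51/100
replace grp = 2 in 101/150
replace grp = 3 in 151/200
* egen assignment from the same random variable
egen grp2 = cut(rannum), group(4)
* cross-tabulate to see how closely the two assignments agree
tabulate grp grp2
```

If the two methods place the observations identically, all counts fall on the diagonal of the table.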
For more information see the Stata manual or Stata Help for functions.