/*** TIP 00385 ***/
 I work in a psychiatric hospital and I have a file that contains
 medication type and dosage in a single field. I need to separate out
 dosage from med as best I can, but the location of dosage info is
 irregular, and so SUBSTR won't do. A couple of examples are:
SEROQUEL TABS 100 MG 
SEROQUEL TABS 25MG (no space here) 
INJECTION: HALDOL DECANOATE 100 MG/ML 
INJECTION: HALOPERIDOL 2 MG/ML 

I suggest looking at a large sample of values before choosing the following algorithms and perhaps a freq check of the resulting fields. Ian Whitlock I concur with Ian - KNOW THY DATA and Check your Results! Charles Patridge
In the examples you show, you could use the INDEXC function to find the location of the first numeral, and then use the SUBSTR function to extract all the text before that. Solution #1 by Jack Hamilton & Arthur Tabachneck data test; infile cards; input @1 medication $50.; cards; SEROQUEL TABS 100 MG SEROQUEL TABS 25MG (no space here) INJECTION: HALDOL DECANOATE 100 MG/ML INJECTION: HALOPERIDOL 2 MG/ML ; run; data drugs; set test; dossage = substr(medication, indexc(medication, '0123456789')); medication = substr(medication, 1, indexc(medication, '0123456789')-1); put / medication= / dossage= /; run;
Solution #2 by Ron Fehd ... continuation from above example: data test2; set test; TypeDosage = medication; MG = index(TypeDosage,'MG');*returns column of MG; if MG then do; do I = MG-2 to 1 by -1 until(Space = ' '); Space = substr(TypeDosage,I,1);end; Dosage = input(substr(TypeDosage,I+1,MG-I-1),4.);*max dosage: 9999; end; else Dosage = .; put dosage; run; /*** end of tip 00385 ***/