Windows Grep 2.2 can search delimited and fixed-width data files at field level. These are typically data files in which every line is a record with the same layout.
Searching delimited lists is easy.
Here is an example delimited text file (the first few lines of the California birth index for 1905):
AARON,MURIEL,J,09/12/1905,FEMALE,MORRIS,SAN FRANCISCO
AASNAES,CARL,O,10/25/1905,MALE,STOVENON,SAN FRANCISCO
ABADICH,GUNTHER, ,10/05/1905,MALE,SCHONGEN,ALAMEDA
ABBAY,THELMA, ,12/07/1905,FEMALE,GRISWALD,SAN FRANCISCO
ABBOTT, , ,08/10/1905,FEMALE,WELLS,SAN FRANCISCO
ABBOTT,CORA, ,10/22/1905,FEMALE,GERLACH,SAN FRANCISCO
ABE,SHOICHI, ,06/12/1905,MALE,YAMASAKI,SAN FRANCISCO
Each line consists of 7 fields separated by commas. The first one is the child’s surname, and the last, field 7, is the place of birth.
If this file is searched with the text file format set to Normal, searches for particular names are done at line level, which can produce false matches.
A search for “Jose” would find births with the surname Jose, mothers with the maiden name Jose and births in San Jose, which is probably not what is wanted.
If the text file format is set to Delimited list, you can specify which field to search and also the field separator character (a comma in this case, but tabs and semicolons are also common).
So, to find births to mothers with a maiden name Jose, select field 6.
To find Jose as the child’s surname, select field 1.
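The difference between a line-level and a field-level search can be sketched in a few lines of Python. This is only an illustration of the idea, not how Windows Grep itself works: the function name `search_field` is made up for this example, and the plain `split` used here makes no attempt to handle quoted fields.

```python
# Sample data: the birth index lines shown earlier in this section.
SAMPLE = """\
AARON,MURIEL,J,09/12/1905,FEMALE,MORRIS,SAN FRANCISCO
AASNAES,CARL,O,10/25/1905,MALE,STOVENON,SAN FRANCISCO
ABADICH,GUNTHER, ,10/05/1905,MALE,SCHONGEN,ALAMEDA
ABBAY,THELMA, ,12/07/1905,FEMALE,GRISWALD,SAN FRANCISCO
ABBOTT, , ,08/10/1905,FEMALE,WELLS,SAN FRANCISCO
ABBOTT,CORA, ,10/22/1905,FEMALE,GERLACH,SAN FRANCISCO
ABE,SHOICHI, ,06/12/1905,MALE,YAMASAKI,SAN FRANCISCO
"""


def search_field(lines, field_number, text, separator=","):
    """Return the lines whose field `field_number` (counting from 1)
    contains `text`, ignoring case. Hypothetical helper for illustration."""
    needle = text.lower()
    matches = []
    for line in lines:
        fields = line.rstrip("\n").split(separator)
        # Skip lines that do not have the requested field at all.
        if 1 <= field_number <= len(fields):
            if needle in fields[field_number - 1].lower():
                matches.append(line)
    return matches


if __name__ == "__main__":
    lines = SAMPLE.splitlines()
    # Line-level search: matches anywhere on the line.
    print(len([line for line in lines if "SAN" in line]))
    # Field-level search: child's surname only (field 1).
    print(len(search_field(lines, 1, "SAN")))
    # Field-level search: mother's maiden name (field 6).
    print(search_field(lines, 6, "wells"))
```

With the sample above, a line-level search for SAN matches the six San Francisco births, while the same search restricted to field 1 matches nothing.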
Regular expressions and soundex searches can be used at field level. (Soundex is particularly useful for searching for names when you are not sure of the spelling.)
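To show why soundex helps with uncertain spellings, here is a minimal sketch of the common American Soundex coding applied to a single field. The exact soundex variant used by Windows Grep is not stated here, so treat this as an assumption; the names `soundex` and `soundex_field_matches` are invented for the example.

```python
# Letter groups of the common American Soundex scheme.
_GROUPS = (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
           ("L", "4"), ("MN", "5"), ("R", "6"))
_CODES = {letter: digit for letters, digit in _GROUPS for letter in letters}


def soundex(name):
    """Return the 4-character Soundex code of `name`, or "" if it has no letters."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    previous = _CODES.get(letters[0], "")
    for letter in letters[1:]:
        if letter in "HW":
            continue  # H and W do not separate letters that share a code
        code = _CODES.get(letter, "")
        if code and code != previous:
            result += code
        previous = code  # a vowel resets this to "", allowing a repeat
    return (result + "000")[:4]


def soundex_field_matches(lines, field_number, name, separator=","):
    """Return lines whose field `field_number` (from 1) sounds like `name`."""
    target = soundex(name)
    matches = []
    for line in lines:
        fields = line.split(separator)
        if 1 <= field_number <= len(fields):
            if soundex(fields[field_number - 1]) == target:
                matches.append(line)
    return matches


if __name__ == "__main__":
    lines = [
        "ABBAY,THELMA, ,12/07/1905,FEMALE,GRISWALD,SAN FRANCISCO",
        "ABBOTT,CORA, ,10/22/1905,FEMALE,GERLACH,SAN FRANCISCO",
    ]
    # Searching field 6 for "Griswold" still finds the GRISWALD spelling.
    print(soundex_field_matches(lines, 6, "Griswold"))
```

Under this coding GRISWALD and GRISWOLD both reduce to G624, so a search of field 6 for either spelling finds the same record.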
Searching fixed-width lists is not so easy.
Here is an example of a fixed width list, the same information as above, but formatted differently. Notice how the fields line up vertically in the file.
AARON    MURIEL  J 09/12/1905 FEMALE MORRIS        SAN FRANCISCO
AASNAES  CARL    O 10/25/1905 MALE   STOVENON      SAN FRANCISCO
ABADICH  GUNTHER   10/05/1905 MALE   SCHONGEN      ALAMEDA
ABBAY    THELMA    12/07/1905 FEMALE GRISWALD      SAN FRANCISCO
ABBOTT             08/10/1905 FEMALE WELLS         SAN FRANCISCO
ABBOTT   CORA      10/22/1905 FEMALE GERLACH       SAN FRANCISCO
ABE      SHOICHI   06/12/1905 MALE   YAMASAKI      SAN FRANCISCO
To search individual fields, you need to know the starting column of the field of interest and its width. A good way of working this out is to type a ruler line above the data, for example:
         1         2         3         4         5         6
....!....|....!....|....!....|....!....|....!....|....!....|....
AARON    MURIEL  J 09/12/1905 FEMALE MORRIS        SAN FRANCISCO
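If typing the ruler by hand is tedious, it can be generated. The sketch below builds the two ruler lines for any width and then asks Python where a known value starts. The function name `ruler` is invented for this example, and the spacing of the sample record is an assumption chosen to agree with the start positions quoted in this section.

```python
def ruler(width):
    """Return (numbers_line, marks_line) for a ruler `width` columns wide.

    Marks use "." for ordinary columns, "!" for every 5th column and
    "|" for every 10th; the numbers line counts the tens above each "|".
    """
    marks = "".join(
        "|" if col % 10 == 0 else "!" if col % 5 == 0 else "."
        for col in range(1, width + 1)
    )
    numbers = [" "] * width
    for col in range(10, width + 1, 10):
        label = str(col // 10)
        # Right-align the label so its last digit sits above the "|".
        numbers[col - len(label):col] = label
    return "".join(numbers).rstrip(), marks


# Assumed layout of one record (intermediate column positions are a guess).
RECORD = "AARON    MURIEL  J 09/12/1905 FEMALE MORRIS        SAN FRANCISCO"

if __name__ == "__main__":
    numbers, marks = ruler(len(RECORD))
    print(numbers)
    print(marks)
    print(RECORD)
    # Columns are counted from 1, Python string indexes from 0.
    print("start column:", RECORD.index("SAN FRANCISCO") + 1)
    print("field width: ", len("SAN FRANCISCO"))
```

The start column comes out one higher than the Python index because column numbering begins at 1.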
This time, to search the 7th field, for example, specify a start position of 52 and a field width of 13; to search the 2nd field, specify a start position of 10 and a width of 8.
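In code terms, a fixed-width field is just a slice of the line defined by its start position and width. The following sketch assumes a column layout that agrees with the two positions given above (field 2 at column 10, field 7 at column 52); the other widths, and the helper names `make_record`, `fixed_field` and `search_fixed`, are assumptions made for illustration only.

```python
# Assumed field widths: surname, given name, initial, date, sex,
# mother's maiden name, place of birth. Only the positions of
# fields 2 and 7 come from the text; the rest are guesses.
WIDTHS = (9, 8, 2, 11, 7, 14, 13)


def make_record(*values):
    """Pad each value to its assumed width and join into one line."""
    return "".join(v.ljust(w) for v, w in zip(values, WIDTHS)).rstrip()


def fixed_field(line, start, width):
    """Return the text in `width` columns beginning at column `start` (from 1)."""
    return line[start - 1:start - 1 + width]


def search_fixed(lines, start, width, text):
    """Return lines whose field at (start, width) contains `text`, ignoring case."""
    needle = text.lower()
    return [line for line in lines
            if needle in fixed_field(line, start, width).lower()]


if __name__ == "__main__":
    records = [
        make_record("AARON", "MURIEL", "J", "09/12/1905",
                    "FEMALE", "MORRIS", "SAN FRANCISCO"),
        make_record("ABADICH", "GUNTHER", "", "10/05/1905",
                    "MALE", "SCHONGEN", "ALAMEDA"),
    ]
    print(repr(fixed_field(records[0], 52, 13)))  # 7th field
    print(repr(fixed_field(records[0], 10, 8)))   # 2nd field, with padding
    print(search_fixed(records, 52, 13, "alameda"))
```

Fields shorter than their width come back padded with spaces, so strip them before comparing whole values.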
Regular expressions and soundex searches can also be used at field level in these types of file.
A common use of Windows Grep is searching the California birth and death indexes. The birth index is a comma-delimited file, while the death index is fixed width. I have written a tutorial on how to use it for this purpose on my website (see the hyperlink in the About box).