parallel parsing algorithms needed

I need some guidance on how to parallelize a parsing process. My data set consists of HTTP requests extracted from raw TCP packets. Here is an example of 2 records:

record 1:

GET /jobs/companies/ad/impression?a=dYH81OxGNzEUQBdC HTTP/1.1
Host: stackoverflow.com
Connection: keep-alive
Accept: image/webp,image/*,*/*;q=0.8
User-Agent: Mozilla/5.0 (X11; Fedora; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36
Referer: http://stackoverflow.com/jobs
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8,ru;q=0.6
Cookie: __cfduid=d52062217ef00a64815f1d547ae2cc3ca1452608519; prov=73d727bd-8139-4de2-bef1-79fab3f63e75; __qca=P0-1147719590-1452608520281; cc=a855107760874fcc9ddf67109fdbb069; csous$

record 2:

GET /UiHVAm.jpg HTTP/1.1
Host: i.stack.imgur.com
Connection: keep-alive
Accept: image/webp,image/*,*/*;q=0.8
User-Agent: Mozilla/5.0 (X11; Fedora; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36
Referer: http://stackoverflow.com/jobs
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8,ru;q=0.6
Cookie: __cfduid=dda24824cb6f039994a07219ab0ee40131452607011

The first line is the request line, followed by field-value pairs.
What I was thinking is to first identify the positions of the newline characters in the whole request text, then launch threads that each identify a header field (the part to the left of the ‘:’ character, for example ‘Host’; there are about 40 different header fields), with the value being everything to the right, up to the end of the line. But then I thought: if I have already done that many comparisons to find the newline characters, I could have identified the header fields in the same pass... this made me doubt whether my algorithm is efficient.
Searching on Google, I found some papers on parsing, but I am not sure how much progress has been made in this field, or what the fastest parser for my case would be.
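
To make the two-phase idea concrete, here is a rough sequential sketch in Python (not GPU code). Phase 1 collects the line start offsets; phase 2 parses each line without looking at any other line, which is the part that could be handed to separate threads. The function names (find_line_starts, parse_header_line, parse_request) are made up for illustration, and the sketch assumes one complete request per buffer.

```python
def find_line_starts(request):
    """Phase 1: return the byte offset at which each line begins."""
    starts = [0]
    pos = request.find(b"\n")
    while pos != -1:
        if pos + 1 < len(request):
            starts.append(pos + 1)
        pos = request.find(b"\n", pos + 1)
    return starts


def parse_header_line(request, start, end):
    """Phase 2: split one 'Name: value' line; needs no other line's data."""
    line = request[start:end].rstrip(b"\r\n")
    name, sep, value = line.partition(b":")
    if not sep:
        return None  # blank line or a line without a colon
    return name.strip().decode("ascii"), value.strip().decode("latin-1")


def parse_request(request):
    starts = find_line_starts(request)
    ends = starts[1:] + [len(request)]
    request_line = request[starts[0]:ends[0]].rstrip(b"\r\n").decode("ascii")
    headers = {}
    # Each iteration is independent, so this loop is the parallelizable part.
    for start, end in zip(starts[1:], ends[1:]):
        parsed = parse_header_line(request, start, end)
        if parsed is not None:
            headers[parsed[0]] = parsed[1]
    return request_line, headers
```

Phase 1 still touches every byte once, which is exactly the cost I am unsure about.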

I would very much appreciate your comments.
Regards
Nulik

The forum archive shows at least one discussion on parallel parsing:

https://devtalk.nvidia.com/default/topic/517642/

Your examples are ~512 bytes so a small and simple algorithm might be your best bet.
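
As one possible reading of "small and simple", here is a single-pass sketch in Python that records the first ‘:’ and the end of each line in the same scan, so no separate newline search is needed. The name parse_single_pass is hypothetical, and the code assumes one complete request per buffer with either LF or CRLF line endings; it is an illustration, not a tuned implementation.

```python
def parse_single_pass(request):
    """Scan the bytes once, noting line boundaries and the first ':' per line."""
    spans = []          # (line_start, first_colon_or_-1, line_end)
    line_start = 0
    colon = -1
    for i, byte in enumerate(request):
        if byte == 0x3A and colon == -1:      # first ':' on this line
            colon = i
        elif byte == 0x0A:                    # '\n' closes the line
            # Drop a preceding '\r' so CRLF and LF inputs behave the same.
            if i > line_start and request[i - 1] == 0x0D:
                end = i - 1
            else:
                end = i
            spans.append((line_start, colon, end))
            line_start, colon = i + 1, -1
    if line_start < len(request):             # last line without a newline
        spans.append((line_start, colon, len(request)))

    # The first line is the request line; any ':' in its URL is ignored.
    first_start, _, first_end = spans[0]
    request_line = request[first_start:first_end].decode("ascii")
    headers = {}
    for start, colon, end in spans[1:]:
        if colon == -1:
            continue                          # blank line or no field name
        name = request[start:colon].strip().decode("ascii")
        headers[name] = request[colon + 1:end].strip().decode("latin-1")
    return request_line, headers
```

With records of roughly 512 bytes, a natural way to get parallelism is to run a scan like this on many requests at once, one request per thread, rather than splitting a single request.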