Skåne Sjælland Linux User Group - http://www.sslug.dk
 

Re: [PROGRAMMERING] Extracting URLs from an ASCII file



On 14/07/2013, at 21.18, Jens Bang <sslug@sslug> wrote:

> 
>> but offhand the following Python should probably be able to handle it
>> 
>> https://gist.github.com/jinie/5992705
>> 
>> ==============================
>> #!/usr/bin/env python
>> import re
>> import sys
>> 
>> # Capture the value that follows a "url" or "referer" key
>> ex = re.compile(r'"(url|referer)":"(.*)"')
>> with open(sys.argv[1]) as f:
>>     for line in f:
>>         m = ex.search(line)
>>         if m:  # skip lines without a match
>>             print(m.group(2))
>> ==============================
> 
> This one looked like it would have worked, if it had not assumed that there would be line breaks in the file.



It got a bit more complicated, but who says no to a challenge, and now it is ready for next time :)

https://gist.github.com/jinie/5996642

==============================
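import re
import sys
from functools import partial

bufsize = 4096
# Capture the value that follows a "url" or "referer" key
ex = re.compile(r'"(?:url|referer)":"([^"]*)"')

remainder = ""
with open(sys.argv[1]) as f:
    # Read fixed-size chunks, since the file contains no line breaks
    for buf in iter(partial(f.read, bufsize), ""):
        # Prepend whatever was left unmatched from the previous chunk
        buf = remainder + buf
        endpos = 0
        for m in ex.finditer(buf):
            print(m.group(1))
            endpos = m.end()
        # Keep the tail after the last match; an entry may span two chunks
        remainder = buf[endpos:]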
==============================
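
If you want to convince yourself that entries split across two chunks are still found, here is a small sketch of the same idea wrapped in a generator. The function name `iter_urls`, the sample data and the tiny `bufsize` are my own assumptions for illustration and are not part of the gist.

```python
import io
import re

# Same pattern as in the script above: capture the value of "url" or "referer".
PATTERN = re.compile(r'"(?:url|referer)":"([^"]*)"')


def iter_urls(stream, bufsize=4096):
    """Yield every url/referer value from a text stream without line breaks."""
    remainder = ""
    while True:
        chunk = stream.read(bufsize)
        if not chunk:
            break
        buf = remainder + chunk
        endpos = 0
        for m in PATTERN.finditer(buf):
            yield m.group(1)
            endpos = m.end()
        # Keep the unmatched tail; it may hold the start of a split entry.
        remainder = buf[endpos:]


# Hypothetical sample data; a tiny bufsize forces entries to straddle chunks.
sample = (
    '{"url":"http://a.example/x","referer":"http://b.example/"}'
    '{"url":"http://c.example/"}'
)
urls = list(iter_urls(io.StringIO(sample), bufsize=7))
print(urls)
```

Because the pattern only matches once the closing quote has been read, an entry that is cut off at the end of a chunk simply stays in `remainder` until the next read completes it. One thing to keep in mind: if the file contains no matches at all, `remainder` grows to the size of the whole file.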

/Jimmy


 

 
 
Questions about the web-pages to <www_admin>. Last modified 2013-08-01, 02:05 CEST
This page is maintained by MHonArc