IPv6 site snapshot

Martin Millnert martin at millnert.se
Wed Apr 27 01:55:10 CEST 2011


Jared,

On Tue, 2011-04-26 at 19:34 -0400, Jared Mauch wrote:
> [last update, I really need to attend to family items, see below]
> 
> > dns_names.txt contains URLs and thus some overlap of DNS names.
> > The following purifies:
> > 
> > anticimex@shakira:/dev/shm$ wget -q http://puck.nether.net/~jared/aaaa/dns_names.txt.gz
> > anticimex@shakira:/dev/shm$ gunzip dns_names.txt.gz
> > anticimex@shakira:/dev/shm$ wc -l dns_names.txt
> > 1000000 dns_names.txt
> > anticimex@shakira:/dev/shm$ sed 's/\/.*//' dns_names.txt | sort -u > dns_names_clean_sorted.txt
> > anticimex@shakira:/dev/shm$ wc -l dns_names_clean_sorted.txt
> > 991944 dns_names_clean_sorted.txt
> 
> Yeah, I noticed this as well and cleaned up some of the trailing data.  I've not re-run it yet, but hope to soon.

Ok.

<snip>

> I do have a dig running against the full list and will leave that running overnight and see how far that gets.  I'm glad there is some interest in this, and hope to report more results sometime tomorrow.  It's processed 29k hosts of the ~1000k hosts in 2 hours.  Not that promising for a quick result, but maybe the DNS servers will become less loaded overnight.

I pulled down 20k hosts in 10 mins on a really crappy, overloaded DSL
connection with my script above on my laptop. It was using maybe 50 kBps
or so for the resolution, which was done using a really slow Mikrotik
box (an RB750) as the resolver. Pretty much war zone conditions for this
type of job. :)
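
For scale, here is what the two rates quoted in this thread extrapolate
to. This is only a back-of-the-envelope sketch: it assumes the rate stays
constant over the whole list, and the function name `eta` is made up for
illustration.

```shell
#!/bin/sh
# eta HOSTS_DONE SECONDS_TAKEN TOTAL_HOSTS
# Prints the observed rate (names/s) and the projected hours for
# TOTAL_HOSTS, assuming the rate stays constant (an assumption,
# not a measurement).
eta() {
    awk -v n="$1" -v secs="$2" -v total="$3" 'BEGIN {
        rate = n / secs
        printf "%.1f names/s, %.1f h for %d names\n", rate, total / rate / 3600, total
    }'
}

eta 20000 600 1000000     # the DSL run: 20k hosts in 10 minutes
eta 29000 7200 1000000    # the dig run: 29k hosts in 2 hours
```

At the DSL rate the full million would take roughly 8 hours, at the dig
rate closer to 69. Finishing inside an hour needs about 278 names/s
sustained.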

I imagine the million should be doable in under an hour on a decent
server with decent connectivity (running a scalable resolver on lo, say)
with my stupid script.  With 100x the time invested in coding, the entire
job processing time should approach the bounds set by the various DNS
server operators out there...
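
A minimal sketch of what running the lookups in parallel might look like.
This is not the script referred to above (that part of the thread was
snipped), just an illustration. It assumes `dig` is installed and that
`xargs` supports `-P` (GNU and BSD xargs do; it is not in POSIX). The
function name `lookup_all` and the `RESOLVE` variable are hypothetical;
`RESOLVE` exists only so the lookup command can be swapped out.

```shell
#!/bin/sh
# RESOLVE is a hypothetical hook: a command that takes a name as its
# last argument and prints one result per line. Exported so the
# child shells started by xargs can see it.
RESOLVE=${RESOLVE:-"dig +short AAAA"}
export RESOLVE

# lookup_all FILE [JOBS]
# Runs one lookup per name in FILE, JOBS at a time (default 64), and
# prints "name address" for every result that looks like an IPv6
# address. Output order is not stable across runs.
lookup_all() {
    xargs -P "${2:-64}" -n 1 sh -c '
        $RESOLVE "$1" | while read -r addr; do
            case $addr in
                *:*) printf "%s %s\n" "$1" "$addr" ;;  # keep addresses only
            esac                                       # (skips CNAME targets)
        done
    ' sh < "$1"
}
```

Something like `lookup_all dns_names_clean_sorted.txt 64 > aaaa.txt` would
then run 64 lookups at a time. How far the job count can be pushed depends
on the resolver in front of it, which is the point made above.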

Thanks for going at this data, obviously I and others find it
interesting. :)

Regards,
Martin

(Now let's see if there is some DNS-resolution code for Erlang, and what
happens when you try to resolve 2*1M records, all in parallel, from a
location where bandwidth is not the largest limiting factor :) )
