hpr4135 :: Mining the web

In this episode I talk a bit about a project I have been working on to index the web.

Hosted by Cedric De Vroey on Friday, 2024-06-07. This episode is flagged as Explicit and is released under a CC-BY-SA license.
Tags: docker, redis, hacking, mongodb, scraping, dns, certificate-transparency.

Duration: 00:15:14
Comment #1 posted on 2024-06-12 12:00:27 by norrist

Clever use of transparency data

Using the transparency logs is a clever way to get a list of active domains. Let's Encrypt certs expire after 90 days, so a domain with a recently issued cert is more likely to be active. I don't have any suggestions for managing the data, but it sounds like a solvable problem.

Can you post more about how you are parsing the transparency logs?
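
The episode page does not say how the logs are parsed, so the following is only a minimal sketch of the idea in this comment, not the host's method. It assumes the hostnames (common name and subject alternative names) and the issue date have already been extracted from each log entry; the function names `normalize_names` and `recently_issued` are hypothetical and chosen here for illustration.

```python
from datetime import datetime, timedelta, timezone


def normalize_names(names):
    """Return a sorted list of unique, lower-cased hostnames.

    Wildcard entries such as "*.example.com" are reduced to their base
    domain, trailing dots are dropped, and anything that does not look
    like a dotted hostname is skipped.
    """
    cleaned = set()
    for name in names:
        host = name.strip().lower().rstrip(".")
        if host.startswith("*."):
            host = host[2:]
        if host and "." in host and " " not in host:
            cleaned.add(host)
    return sorted(cleaned)


def recently_issued(not_before, now, max_age_days=90):
    """True if the certificate was issued within the last max_age_days.

    The 90-day default mirrors the certificate lifetime mentioned in
    the comment; it is an assumption, not a value from the episode.
    """
    return timedelta(0) <= now - not_before <= timedelta(days=max_age_days)


if __name__ == "__main__":
    sample = ["*.Example.com", "example.com", "www.example.com.", "localhost"]
    print(normalize_names(sample))
    issued = datetime(2024, 5, 1, tzinfo=timezone.utc)
    print(recently_issued(issued, datetime(2024, 6, 7, tzinfo=timezone.utc)))
```

Deduplicating on the normalized name keeps the domain list small, since one certificate often lists the same domain several times in different forms.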

Comment #2 posted on 2024-06-14 11:34:02 by Henrik Hemrin

Amazing project

What an amazing hacker project. I'm impressed, and it was nice to learn about.

Comment #3 posted on 2024-07-28 02:30:42 by hobs

Federated DB of domains?

Would be really cool to share our domain lists with each other, sort of like how pfSense shares blacklists and whitelists. We could use a federated and redundant db to reduce resource costs and improve performance. Would also be good to ping subdomains randomly from different IPs, to avoid being shadowbanned and missing some important subdomains. Does anyone have any open source code they use for this kind of subdomain enumeration work?

Comment #4 posted on 2024-08-01 10:02:15 by Cedric

Follow up

Been some time since I posted this and I didn't notice these comments before.

@norrist: I will try to make a follow-up this month to go a bit deeper into how stuff was built.

@Henrik Hemrin: Thanks for your support!

@hobs: There are a number of alternatives available for this. The most famous is probably "dnsenum", which also ships with Kali.
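
For readers wondering what wordlist-based subdomain enumeration looks like in code, here is a minimal sketch. It is not how dnsenum or the project in this episode works internally. The function name `enumerate_subdomains`, the `resolve` parameter and the example wordlist are all hypothetical. Candidates are tried in shuffled order, as suggested in comment #3, but querying from different IPs is not covered here.

```python
import random
import socket


def enumerate_subdomains(domain, words, resolve=None, seed=None):
    """Try word.domain for every word and return {hostname: address}.

    resolve is any callable that takes a hostname and returns an
    address, raising OSError (socket.gaierror is a subclass) when the
    name does not resolve. It defaults to socket.gethostbyname.
    """
    if resolve is None:
        resolve = socket.gethostbyname

    candidates = [f"{word}.{domain}" for word in words]
    # Shuffle so the lookups do not follow a predictable order.
    random.Random(seed).shuffle(candidates)

    found = {}
    for host in candidates:
        try:
            found[host] = resolve(host)
        except OSError:
            # Name did not resolve; move on to the next candidate.
            continue
    return found


if __name__ == "__main__":
    # Performs real DNS lookups; only run this against domains you
    # are allowed to probe.
    print(enumerate_subdomains("example.com", ["www", "mail", "dev"]))
```

Passing the resolver in as a parameter keeps the sketch testable without network access and leaves room to swap in a different lookup method later.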

