For those a bit confused: This subdomain redirects to https://beta.shodan.io/host/$YOUR_REMOTE_ADDR. This runs a search against their existing database of information. Shodan has an army of bots that crawl the entire Internet and store what they find along the way. Folks generally pay Shodan for access to this data.
The site returns various HTTP error codes based on the results of that lookup, or shows a fancier page with open ports and other information it has on that IP address. (Example: https://beta.shodan.io/host/1.1.1.1)
There is no active scan occurring here. (But you could be hinting to Shodan that these particular IPs are valid though!)
>(I wouldn't rule out that you may be hinting to Shodan that these particular IPs are valid though!)
For those of us on IPv4, eh. Only 4 billion addresses, and with a lot of that tied up in various large /8s assigned to a few specific organizations (many of which can be assumed to be beyond the casual level), it's just no longer a big job to scan everything all the time at a basic level. Plus, for those of us browsing from our main address, we're leaving a trail all over the web anyway through a host of poorly secured servers. So while I don't disagree that it's worth thinking about information leaks and honeypots and the like whenever dealing with infosec in any way, in this specific case I also don't think this reveals anything of significant value.
We don't retain web logs and the way users interact with Shodan doesn't change the way Shodan crawls the Internet. I.e. using the website/API doesn't change how we look at the Internet.
I'm still confused, because "curl -D - https://beta.shodan.io/host/$ALIP" where $ALIP is the IP address of my Amazon Lightsail instance, gives different results depending on where I do it from.
If I do that query from home, it correctly tells me that 22, 80, and 443 are open. If I do that query from the Lightsail instance itself, I get 404. I also get 404 if I do it from work.
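If anyone wants to script the comparison, here's a minimal Python sketch (assuming the endpoint keeps signalling "no data" with a plain 404, as described above):

    import urllib.request
    import urllib.error

    def shodan_host_status(ip):
        # 404 means Shodan has no cached data for this IP; 200 means it does.
        try:
            with urllib.request.urlopen("https://beta.shodan.io/host/" + ip) as resp:
                return resp.status
        except urllib.error.HTTPError as e:
            return e.code

    # Run this from each vantage point (home, the Lightsail box, work) and
    # compare; the codes should match if the lookup is purely IP-based.
    print(shodan_host_status("1.1.1.1"))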
Strange! Maybe they have Amazon compute resources on a list and respond differently to avoid abuse? If they don't pop in here, maybe ping @shodanhq on Twitter.
I've had the same static ipv4 address for just over 19 months, and have been running an exposed OpenVPN server on an alternative port on that IP for the entire duration. I get no results.
Color me unimpressed that "an army of bots that crawl the entire Internet and store what they find along the way" hasn't port scanned every ipv4 address at least once in the span of 19+ months. That is a very long time for a limited address space.
tl;dr: There is no "army", or that army is armed with pool noodles.
It also depends how you've configured OpenVPN. If you've properly configured it to drop connections that don't provide a valid certificate, then even running it on a regular port would make it invisible to scanners.
I believe the recommended configuration is to run OpenVPN via UDP and only accept connections from trusted certificates. If you're running it on TCP then a scanner would be able to see that you have an open port but still can't see what's running on it.
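For reference, something like this on the server side (illustrative directives only, not a complete config; port and filenames are placeholders):

    # Hedged sketch of the "invisible" OpenVPN setup described above.
    proto udp                # UDP: no TCP handshake, so nothing for a scanner to grab
    port 1194
    tls-auth ta.key 0        # HMAC "firewall": packets lacking the pre-shared
                             # HMAC are dropped before any TLS handshake happens
    remote-cert-tls client   # only accept peers presenting a proper client cert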
It does, but what you see is not a portscan, it is a lookup from the Shodan database and they store information about known services on the public Internet.
Well, they've found a port 22 that I have open. Oh noes! Actually, this seems to undercut the whole "don't use standard ports" argument. They would have found it as well if ssh were at 22222 instead.
> Shodan has servers located around the world that crawl the Internet 24/7 to provide the latest Internet intelligence.
That needs to sink in for anyone ever allowing themselves to believe the fallacy that they can slip under the radar with a security vulnerability, or sleep soundly relying on security by obscurity. You aren't a port hiding on one specific computer on the internet, you are data trying to hide in a relational database.
Does IPv6 reduce the feasibility of full-Internet port scans? If so, that to me would be a compelling reason to use IPv6 beyond “it’s the right thing to do”.
Not really. I scan the internet in a similar fashion to Shodan and have found some promising methods to do host discovery. You obviously can't guarantee 100% coverage, but you can get a reasonable percentage without having to do an exhaustive incremental scan.
> You obviously can't guarantee 100% coverage, but you can get a reasonable percentage without having to do an exhaustive incremental scan.
Depends on what you mean by "a reasonable percentage" there.
If you can scan all of 2^32 addresses in IPv4 in ~5mins (as suggested elsewhere in this thread), then it'd take something like 750,000,000,000,000,000,000,000 years to do the same for all 2^128 addresses in IPv6.
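Showing the working, assuming that 5-minute figure for the whole IPv4 space:

    2^128 / 2^32 = 2^96 ≈ 7.9 × 10^28 full-IPv4-sized scans
    2^96 × 5 min ≈ 4.0 × 10^29 minutes ≈ 7.5 × 10^23 years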
It's possible you've got a technique to find "a reasonable percentage" of all devices listening/responding on IPv6 addresses, but unless 10^-22% is "a reasonable percentage", then no, you can't randomly portscan IPv6 and ever realistically expect to connect to anything at all, never mind come up with some Shodan-like map of almost the entire address space.
This mostly just moves the problem from brute forcing to dictionary attacks though, similar to how passwords mostly get attacked these days. Any IPv6 address that's doing anything on the internet leaves traces of its existence/activity somewhere. I'm guessing there are people popping log files and monitoring major traffic interchanges, and creating their own haveibeenpwned-style lists of IPv6 addresses that're actually in use, then selectively scanning them and probably the closely related subnets. If your "promising methods of host discovery" extend meaningfully beyond that, I'd be super interested if you're prepared to share them?
The reasonable percentage is still relatively high because while the number of addresses goes up stupid fast, the number of hosts connected to the internet does not. The distribution of hosts across allocated blocks is also not random; DHCPv6 implementations have predictable allocators so they don't have to keep 4bn records in memory.
Almost all machines are dual homed, which gives you opportunities for them to leak their v6 address over v4. Once you find a statistically significant sample of addresses you can figure out how each network allocates addresses and then scan them until you start turning up no new hits. For example on my v6 blocks I just use the same last octet as their v4 addresses.
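A toy sketch of that last-octet trick (the prefix and hosts are made-up placeholders; real networks will use other patterns):

    import ipaddress

    # Given a known IPv6 /64 prefix and IPv4 hosts already discovered on the
    # same network, emit candidate v6 addresses that mirror the v4 last octet.
    def candidates(prefix, ipv4_hosts):
        net = ipaddress.IPv6Network(prefix)
        for v4 in ipv4_hosts:
            last_octet = int(v4.rsplit(".", 1)[1])
            yield net.network_address + last_octet

    for addr in candidates("2001:db8:1:2::/64", ["192.0.2.25", "192.0.2.80"]):
        print(addr)  # 2001:db8:1:2::19 and ::50 (hex of 25 and 80)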
You can buy "passive DNS" which is anonymous records of DNS queries and their answers.
So e.g. you can see that the answer for www.google.com was 2a00:1450:4009:81a::2004 for somebody at about 0200 UTC today, but the people providing this data don't provide (and in some cases may not know or be contractually obligated never to tell) "who" asked that question and got that answer.
So this is pretty useful if you're trying to figure out which DNS names exist (as a startup I worked for were doing), and if they have an AAAA record then you get all those records.
If you've got a deliberately public IPv6 server it's very likely it can be found by this sort of method.
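As a toy illustration: once you have candidate names (from passive DNS, CT logs, etc.), the AAAA harvest itself is trivial. The hostnames below are just examples:

    import socket

    # Resolve AAAA records for names you already know about; only names with
    # IPv6 records yield addresses.
    def aaaa_records(hostname):
        try:
            infos = socket.getaddrinfo(hostname, None, socket.AF_INET6)
        except socket.gaierror:
            return set()
        return {info[4][0] for info in infos}

    for name in ("www.google.com", "example.com"):
        print(name, aaaa_records(name))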
Do your maths again though. The smallest allowed conventional subnet in IPv6 is 64 bits wide. So surveying "closely related subnets" is 4 billion times more work than surveying the entire IPv4 Internet.
If there's a machine with IPv6 privacy addressing on a "closely related subnet" you're just never going to find it by brute force.
I wonder what percentage of ip addresses have a dns name that ever gets looked up? I seriously doubt my residential internet connection’s isp-supplied dns name ever gets queried... And I wonder if the numbers change there for ipv4 vs ipv6?
The people who run shodan.io are trying to find workarounds to mitigate that. They were caught a few years ago adding their servers to NTP server pools, and scanning any IPv6 addresses that connected to their NTP server.
All of the NTP servers had hostnames ending in "shodan.io". The pool account itself was registered using a shodan.io email address. I told the pool operator what we were doing. And I answered anybody who emailed me asking about it (including Brad before he wrote a blog post pretending that he figured it out by himself). At no point were we trying to hide this activity and we made no attempts to do so. I understand that to the end-user this was an unexpected way to discover IPv6 but we really weren't trying to hide it and Brad conveniently didn't mention in his blog post that I told him what we were doing.
It seems like splitting hairs to say that you didn't try to hide it, but most end-users didn't know that it was being done. It'd be like me recording my neighbors through their window and claiming there is nothing wrong with it because I haven't tried to hide it from them, even though I also didn't let them know I was doing it.
I don't think that's a good comparison but I doubt we'll agree on it and I'm obviously biased on the matter. I hope the additional context will help readers make their own decision.
By "the Internet" they mean the IPv4 space, right? There are only 3.681 billion public IPv4 addresses so it's a trivial problem to scan them all at a suitably parallel scale.
They're working on scanning IPv6 as well. They got in trouble a few years back after they were observed harvesting IPv6 addresses by running a public NTP server[1].
Searching Shodan for ssh servers that are not on port 22 probably gives you back a Venn diagram containing circles for "people who think security by obscurity works" and "people who think their stuff is important enough to 'hide' by configuring non-standard port numbers".
The intersection there probably has some interesting low hanging fruit in it...
(There's a third circle in that Venn diagram which I sometimes sit in, labeled "people who change port numbers to keep log file noise lower", which while maybe being a valid choice, also opens you up to being thought of as "interesting, possibly low hanging fruit" by the sort of people who think those things.)
You can probably get a pretty good idea of the v6 space by checking domain name registrations, certificate transparency, logging requests from v6 addresses, etc.
I guess this can be useful if you don't know what's going on with your internet connection.
Although the fact that the "results" are cached arguably makes this less useful than GRC and other sites that will actively scan your IP (range).
And even though they provide a timestamp for the reported information (different for different ports by 24 hours or so, at least for me), I'd personally prefer an active scan over a database lookup if I want to know what's going on.
I'll go further and say that while the information provided is absolutely useful, depending on the services you're exposing to the Internet, there are other tools that will give you much more useful, actionable information.
There are rafts of online tools to check for vulnerabilities in specific services. Notably:
And many others, including the GRC Shields Up! port scanner mentioned in other comments in this thread:
https://www.grc.com/shieldsup
As such, unless you're going to use Shodan services, or want to know what information they have about you, it seems like there are other, better tools out there.
What's more there are tons of tools that you can run locally that will provide much more information about the devices on your internal network, since you can run them inside your firewall.
N.B.: I am emphatically not discouraging others from using shodan.io, nor am I claiming that it's bad. Rather, I'm expressing my own opinion as to how I prefer to identify and test my internet-facing attack surface.
Btw we do also offer the ability to request scans (https://help.shodan.io/the-basics/on-demand-scanning) but it's only available w/ a paying account (including the membership which is a one-time payment of $49).
It's not generally used to assess your own attack surface - it's mainly used to assess the attack surface of others using their search syntax. It's a Google of vulnerable systems.
>Mozilla's "include my site in the public results" (including vulnerabilities) by default doesn't seem very privacy respecting.
Then don't use it.
Or check the click-box. Which is what I did.
That said, your Internet-facing IP address isn't private. In fact, it has to be public in order to route traffic to/from you.
I'd note that the shodan.io site had information about my IP address, even though I'd never used it or requested a scan. What's more, I'm included in that database without any opportunity to opt out as the Mozilla site provides.
And just to be clear, I have no connection to Mozilla (or any of the other sites I mentioned). In fact, I'd never looked at Mozilla Observatory before I started poking around for the comment to which you replied and included it only because it had an SSH scanner.
So, as someone else mentioned, it appears to show historical records. I'm not sure what the TTL on these is, but some of the open ports it shows for my dynamic IP are from previous "owners" of the IP and aren't actually open any more.
For the HTTP/S ports, it displays the response headers, including timestamp, so I went through my access logs and found the record, if anyone is curious:
This is timely. I just found out, last night, that my shitty ISP router was exposing the management interface to the whole internet. It was dumb luck that I stumbled across it. I had to port forward 80/443 to nowhere in order to make it stop. Time to get a dumb modem.
I collaborated with Shodan at a previous job and loved working with them. They've built a really solid product that has come a long way. It's a great story of a side project evolving into a massively important resource.
Told me I have port 1723 open, which was a surprise. But it did NOT tell me that I have port 22 open, which I do. False negatives are a serious problem for a service like this, much more serious than false positives.
Have you had port 22 open for a while, and continuously? It's using cached results, not an active scan. False negatives, though, would definitely be worse in this application.
Also, at least for me it shows nothing while I do have ports open, but that's because I whitelist limited IP ranges or single IPs for ports rather than just opening them up to the net in general. I have a VPS Wireguard bastion I bounce through for remote LAN access. That itself is a good reminder that it's a limited tool: if a system in my whitelisted range were compromised, it'd suddenly have more options. And conversely, if I already had something lazy or malicious (maybe IoT, a compromise, or both) running on my network that was careful about what it talked to, a port scan alone wouldn't necessarily root it out.
Still, it's a potentially useful high-level pass for low effort; it could make one aware of some surprise devices, fat-finger mistakes, or the like. But "If you see 404 page, nothing is exposed" is overstating it.
Yeah, it's definitely missing some stuff, at least. I have a port open for WireGuard VPN traffic that it completely misses, but that's UDP so maybe that's why.
It shouldn’t be able to find WireGuard ports. WireGuard drops all traffic that isn’t from a key it trusts, so it’s impossible to tell if you have WireGuard running on a port.
This seems to be largely useless if you have a dynamic IP. According to the scan, I have FTP and HTTP open, but I just double checked, and that is definitely not the case.
That is unsurprising. It doesn't check in real time, it checks periodically and caches the results. But my ssh is obviously running continuously so it really should have caught it.
It did catch port 22 on another IP address that I have.
Wow, I haven't thought about GRC in a long time. I've always thought the guy was somewhat of a crackpot/sensationalist/self-promoter because he seemed to have a standard recipe of finding some interesting-to-him feature/aspect of a system and then declaring it a glaring security problem and then writing sensationalized screeds about how everyone needs to use his utilities to protect themselves from it (I swear, he probably shouted about the world ending because of XP raw sockets for a decade longer than it was even relevant). I kind of liked SQRL as a concept but I also remember thinking, "ugh, but it's this guy". He seems to have a decent amount of technical knowledge but a lot of what he writes just seems to be about a need to be considered an expert at something, aimed at people who don't know any better. Just as an example, the idea of people referring to his web page [1] to validate the cert fingerprints of popular websites is ... really bizarre from a security standpoint. I understand him having concern (as do many of us) about the security of the CA hierarchy, but where did he get the fingerprints? How did he validate them? And why should anyone trust him to have done so? Is his web page more secure than the CAs?
... but I digress. Apparently I have some very strong skepticism of Steve Gibson that I wasn't even aware of until I had a visceral reaction to ShieldsUp! (which is probably a perfectly fine service) :D
The word that comes to mind for Steve Gibson is "kook". Not in a nasty way, just the sort of person who keeps sending purported proofs of the abc conjecture to a University mathematics professor they met briefly twenty years ago, or who frequently writes to their senator with opinions about a show they saw on TV. Harmless yet somehow mildly annoying.
But as to that fingerprints page. The idea isn't really about any problem with the Web PKI (what you're calling "the CA hierarchy"). Instead it's about a method you can try to use to figure out if you're using a client that's configured to trust something else entirely. In most cases this means a middlebox, maybe at a school (to stop kids looking at porn and maybe cheating on tests) or at work (porn again, and maybe other non-work stuff, sometimes anti-exfiltration technology) but it could be your own AV software.
It can't really do what it says on the tin, for a few reasons. You're going to get some false alerts this way because the certificate Steve (and Steve's web site) sees won't necessarily match the one you see, and if a sophisticated adversary was trying to defeat Steve's system they could just use the same technique to replace Steve's own site.
Also as often happens with these little projects, it isn't properly maintained. The stuff about EV in Firefox (and I think Chrome too?) is no longer true.
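For what it's worth, the underlying check is easy to sketch: hash the DER certificate your client actually receives and compare it with a fingerprint obtained out of band. SHA-256 here is my own choice; Steve's page may use a different hash:

    import hashlib, socket, ssl

    # Fetch the certificate this client actually receives and fingerprint it.
    # A mismatch with an out-of-band value suggests something in the middle
    # re-terminated TLS (a middlebox with its own trusted root, for example).
    def sha256_fingerprint(host, port=443):
        ctx = ssl.create_default_context()
        with socket.create_connection((host, port)) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                der = tls.getpeercert(binary_form=True)
        return hashlib.sha256(der).hexdigest()

    print(sha256_fingerprint("example.com"))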
Ah, you're right, I didn't actually read through the alarmist stuff to see that he was referring to MITM proxies that actually install their own roots on your machines. Now I'm amused at the idea that there could be some off the shelf MITM proxy out there that simply does content editing to replace any occurrences of the FP of the real cert they're MITM'ing with the FP of the cert they generated for you on the fly. :D
When I saw the alarmist stuff about knowing the cert fingerprints for certain sites a priori, I thought he'd invented a crazy ad hoc version of cert pinning and just rolled my eyes.
The removal of raw sockets in XP SP2 did not prove Steve Gibson right, as he had predicted the internet would become unusable with raw sockets in Windows, which it did not.
SP2 rolled up a large number of fixes intended to tighten the security of Windows XP; there was no explanation provided other than that it was a security enhancement.
Ultimately the raw sockets thing had no real-world impact on network attacks; the types of attacks raw sockets allowed had largely been supplanted by then by simpler attacks carried out in a distributed manner, as bandwidth allowed more of a brute-force approach.
No idea, but linux has had raw sockets capabilities for its entire existence and the internet continues to exist.
People have the ability to put arbitrary packets out on their network interfaces. The fact that some operating systems don't directly assist you in doing this is not a security feature.
Edit: (and the fact that Gibson doesn't understand this is one reason people think he's a kook)
If that were the only argument, then yes, at the time it would have been a pretty weak refutation. But even at the time, it was obvious to anyone who really understands security that relying on the bad actor's own OS to prevent them from launching attacks is a losing strategy. This was an overblown non-issue. The linux comparison is simply another demonstration of it.
I think I've heard of this before but never actually used it. It's interesting because it misses the ports I have that are wide open; maybe it doesn't scan all ports. Also, I run some IP-block whitelists for ports under 1024, so they must be scanning from outside the regions I allow traffic on those ports from.
I have quite a few ports open on my webserver (22, 8080, 80, 443, 444, 81), but the thing is I use them regularly for SSH, serving a website, etc. Is it bad that they are getting picked up by this service?
If you know that they're open, that's different from thinking you don't have any open ports when you do, which is probably what this is supposed to tell you about.
> Is it bad that they are getting picked up by this service?
Only if you are relying on no one knowing they're open as a "security feature" (a.k.a. "security by obscurity").
If you aren't worried about anyone "finding" them because you've taken the time to secure the services then there's absolutely no reason at all to care that this exists.
I had Caddy server running and port 80 was open, but the scanner also said port 53 was open. After quitting Caddy apparently both ports are now closed. Any reason for Caddy to open port 53?
53 is DNS, so something was likely listening for DNS on that port. Perhaps Caddy runs a local DNS cache and your firewall allows outside traffic to reach it. It might even be trying to provide DNS for ACME DCV.
Yes I think I may have opted to include the cloudflare-dns plugin when downloading Caddy. Perhaps it opens port 53 even though it isn't actually being used? Or perhaps Caddy's automatic HTTPS provisioning logic does this...
Bonus challenge: Use WebRTC to find the internal IP of the connecting device, then use DNS rebinding to port scan the device itself and report those open ports :)
I'm a little confused on my results. I'm getting an actual website with data on it, not a 404. However, in one of the data boxes on the site, I get "HTTP/1.1 404 Not Found".
Specifically, on the right side of the screen there are two boxes. The top one is titled "Open Ports", and lists a single port in it, 7547.
But the box below that is titled "// 7547 / TCP /". In that box is the text "HTTP/1.1 404 Not Found".
So..., am I leaking port 7547, or not? (Fwiw, http://portscan.me/ doesn't find any leaked ports, and sees my host IP as offline).
The service running on that port is for remote management by your ISP [0]. Just like damn near everything else nowadays, it uses HTTP and the 404 status is being returned by the web server running on that port.
Certain ISP-supplied routers listen on port 7547. It's used by your ISP to access your router remotely (TR-069/CWMP) to perform software upgrades and the like.
Of course, that's just what they claim. Your level of paranoia that it will get hacked and trust in your ISP to not use it to spy on you or override your own configuration is up to you.
My ISP was "nice" enough to supply a router that could be flashed with DD-WRT (it's fiber (to the home) so it's just a box on the wall straight into the router, so whats actually running on the router is under my control).
Wow, it tells me how many people are on my Minecraft server. I couldn't even find that easily (haven't looked too closely yet). Anyway, a bit worrying because I leave the whitelist off when my son's friends want to join.
I believe Netcraft still produce this sort of report on a monthly basis and use it to record (among other things) the relative popularity of webservers. At least, they did when I worked there many moons ago; and they monetised such technology. Biggest hurdle isn't the software or execution time: rather, it's making an agreement with your ISP/hosting company wherein they allow you to portscan the entire internet on a regular basis without flagging it as abuse.
Similarly to Shodan, this is looking things up in Censys's database. Censys scans more ports than Shodan, and has more up to date data. Also, I work for Censys.
Shodan is an amazing tool. Missing the 10th-anniversary lifetime premium accounts when they were available for $1 is going to be the biggest missed opportunity of having been dirt poor.
It just yesterday helped me visualize the importance of not exposing things to the internet and might have prevented serious future intrusions.
Does it require a login to force an update? I have a dynamic IP and it shows an open port (possibly router management) of another user, updated a couple of weeks back. I understand that cached results are faster, but does it make sense to cache results for dynamic IPs?
For me it redirected to the wrong address, x.x.x.7 instead of x.x.x.1. At x.x.x.7 it got a 404; at the correct address it identified my ISP and said I have port 179 open which appears to be correct. I'm currently trying to find out why port 179 (BGP?) is open.
The WireGuard default UDP port is open on my home router. This website says 404. Why? Because the scans don't cover the full UDP port range, is my guess. I trust this with regards to TCP.
“One design goal of WireGuard is to avoid storing any state prior to authentication and to not send any responses to unauthenticated packets. With no state stored for unauthenticated packets, and with no response generated, WireGuard is invisible to illegitimate peers and network scanners.” https://www.wireguard.com/papers/wireguard.pdf
Thanks for confirming. It seems to only work on my local network (ie not on my phone if I disconnect from wifi), so I think all is well. Not entirely sure what shodan was seeing, but I suspect it's IP rotation (I haven't had this IP for too long).
I’m not sure how this works but there is something odd about it.
If I connect from a VPN it shows results from my public IP, not the VPN server’s public IP. It does this on multiple VPNs, despite my traffic going over the VPN. Is it caching some device identifier?
Edit: weird, if I connect the VPN then open a private browser tab it then checks the VPN IP. A simple refresh of the tab once VPN is up doesn’t work.
Edit 2: I’m an idiot. I just noticed the page URL.
Can anyone tell me why server_tokens would be on by default in Nginx? Why would it be standard default practice to disclose what version your goddamn web server is running?
By itself, disclosing version information has little to no security consequence. If you are using an outdated, vulnerable server version, you will be exploitable regardless of whether you present a version number in the vast majority of cases. Attackers generally don't care whether you present a specific version number before attempting exploits (unless the exploit has a risk of crashing the service). And if you do have an exploit which depends on a specific version, most likely you can figure out the version without a version number anyway. Hiding version numbers probably does more to hurt defenders (who want to easily scan for and identify outdated software without attempting exploits).
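Either way, it's easy to see what your own server discloses (the URL is a placeholder; in nginx the version is suppressed with "server_tokens off;" in the http block):

    import urllib.request

    # Fetch only the headers and print the Server banner.
    req = urllib.request.Request("https://example.com/", method="HEAD")
    with urllib.request.urlopen(req) as resp:
        # e.g. "nginx/1.18.0" with server_tokens on, just "nginx" with it off
        print(resp.headers.get("Server"))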
It's always occurred to me that you'd use evolving version data from an aggregator like Shodan to build a picture of how up-to-date people keep their software; that way, when a new vulnerability hits, you have a prioritised list of IPs that haven't updated in a timely manner in the past, rather than wasting cycles trying to exploit auto-updating hosts.
The cost of any additional untargeted attack attempt is essentially zero in most cases. It doesn't matter whether you are trying your exploit on 100 hosts or 1 million. An attacker willing to spray exploits across the internet has basically zero incentive to only use those exploits on hosts they know to be running a specific version, and every incentive to just try it out on all hosts running the software that they can possibly identify.
I suppose that's true. It's hard to think in terms of an attacker essentially having unlimited resources, but of course all the resources they're using are already hacked/stolen.
We don't have to consider anything near unlimited resources here - you can do a masscan of the internet on commodity hardware in an hour, or you have a shodan sub (they've sold lifetime basic subscriptions before for $5). Actually doing the exploitation on every target again probably takes under an hour with a couple cheap droplets. The only thing that actually requires any effort is setting up a reliable C&C infra.
One of my favorite games! I picked the name as an homage to it, since it's my favorite game where you play a hacker. And to be fair, I didn't think that Shodan would become as big as it has when I first launched the website.
Actually, ideally you wouldn't even see a 404 page. Your web browser should just time out when attempting to connect.
If you are seeing a 404 page, that means a web server is listening on port 80, received the request, and is responding with a 404 page. That's not a good thing!