
examining a site

(@johnr)
Eminent Member
Joined: 16 years ago
Posts: 25
Topic starter  

Hey guys,

I'm stuck on finding a guideline for how to go about examining a website. I'd be grateful for any pointers.

Cheers,


   
(@kovar)
Prominent Member
Joined: 18 years ago
Posts: 805
 

Greetings,

What is the purpose of examining the web site? What sort of website? What tools were used to create the website?

Usually when this sort of thing comes up, I suggest using wget/httrack to get a copy of the site, Acrobat Pro to turn the site into PDFs, and a streaming video capture tool to "film" a user navigating the site if it uses a lot of JavaScript or something similar that doesn't lend itself to either of the other approaches.
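If it helps to see what that boils down to, here's a very rough Python sketch of grabbing a single page for preservation; wget/httrack do this recursively and handle links and assets properly, and the URL here is only a placeholder:

```python
# Rough stand-in for a single-page "wget" grab. The URL is a placeholder,
# not a real target; wget/httrack also follow links and fetch assets.
import urllib.request
from pathlib import Path

url = "http://example.com/"          # placeholder target
out = Path("site_copy/index.html")   # local preservation copy
out.parent.mkdir(parents=True, exist_ok=True)

with urllib.request.urlopen(url) as resp:
    out.write_bytes(resp.read())     # save the raw bytes exactly as served

print(f"Saved {url} to {out} ({out.stat().st_size} bytes)")
```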

-David


   
(@johnr)
Eminent Member
Joined: 16 years ago
Posts: 25
Topic starter  

Just from a general point of view, really; I want to know what each approach would do. I've looked into using telnet to obtain information without giving too much away, but it hasn't returned much.


   
(@kovar)
Prominent Member
Joined: 18 years ago
Posts: 805
 

Greetings,

Using telnet to examine a website seems to be an odd, and somewhat aggressive, approach.

1) Open the page source for each page you're interested in to see how it was built (a small sketch of this follows after the list).
2) Use Acrobat Pro to turn the entire site into PDFs for preservation purposes.
3) Use wget or httrack to copy the entire site to your local system for examination.
4) Use a video capture tool to record how the site works as a user navigates through it.
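
To illustrate step 1 (the URL is only a placeholder, and Python stands in for "view source" plus a bit of note-taking), here's a small sketch that fetches a page and lists the links and resources it references, which is also how mirroring tools decide what to copy next:

```python
# Fetch a page's source and list the links/resources it references.
# The URL is a placeholder; this is only a quick look at how a page is built.
import urllib.request
from html.parser import HTMLParser

class LinkLister(HTMLParser):
    def handle_starttag(self, tag, attrs):
        if tag in ("a", "img", "script", "link"):
            for name, value in attrs:
                if name in ("href", "src") and value:
                    print(f"{tag:6} -> {value}")

url = "http://example.com/"
with urllib.request.urlopen(url) as resp:
    LinkLister().feed(resp.read().decode("utf-8", "replace"))
```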

-David


   
(@johnr)
Eminent Member
Joined: 16 years ago
Posts: 25
Topic starter  

How would you go about detecting a hidden page?


   
(@jhup)
Noble Member
Joined: 16 years ago
Posts: 1442
 

Using telnet would not reveal links in Flash or other plugins.

This sounds like you are not examining a site you have access to (the site itself, its logs, etc.), but rather doing some covert research on one.

Is this correct?

You have no way of detecting a truly "hidden" page.

Think about it. For example, the page could live in a database that requires a very specific URL construct, maybe a time window, a mangled browser name, an IP range, the phase of the moon, etc. And that is just a database-based solution. It could be combined with URL rewrites, additional codes, field validation, and so on.
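
To make that concrete, here is a toy Python sketch of a server that only serves a "hidden" page when a made-up path and a made-up header value line up; from the outside there is nothing to crawl or enumerate:

```python
# Toy illustration only: the "hidden" page is never linked anywhere and only
# exists if the request carries a specific header. Path and header are made up.
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Without knowing both the path and the header, every probe gets a 404
        # that looks identical to any other missing page.
        if self.path == "/x9k2-report" and self.headers.get("X-Magic") == "moonphase":
            body, status = b"hidden content", 200
        else:
            body, status = b"Not Found", 404
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8000), Handler).serve_forever()
```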


   
(@johnr)
Eminent Member
Joined: 16 years ago
Posts: 25
Topic starter  

It's more of a general question than a specific scenario I have. I know robots.txt stops certain pages from being spidered, but apart from that, once I've stumbled across a site I consider of interest, I wouldn't know how to conduct such an investigation without logs. Yes, covert would be a great place to start, but how? There's little information on Google that answers these questions.


   
(@kovar)
Prominent Member
Joined: 18 years ago
Posts: 805
 

Greetings,

John, you're being a bit evasive and you're not providing us much information that we can use to assist you. Speaking for myself, this causes me to wonder about your motives and until I'm more comfortable with your motives I'm reluctant to provide you with any more assistance.

-David


   
(@johnr)
Eminent Member
Joined: 16 years ago
Posts: 25
Topic starter  

Ok, sorry. I'm doing an assessment for my university degree, and therefore trying to keep it vague. We've studied tools such as telnet to communicate with servers without giving too much away to the server about the client, so I wondered if this played a vital role in investigating websites. I'm guessing not. I then moved on to how to detect hidden pages and tried to answer that. I've found an article that discusses robots.txt, but apart from that I'm struggling to find out whether there's another way to detect if a page or its contents have been hidden from normal public view.

I hope this restores faith in my name and my posts.


   
(@darksyn)
Trusted Member
Joined: 17 years ago
Posts: 50
 

Okay, johnR, a couple of things here… You first of all need to decide whether you're going to be doing an overt or a covert investigation, and then decide if you're going to be looking at normal (X)HTML pages or start playing around with PHP, ASP, etc. pages.

In the first class (client-side), yes, you are heading in the right direction by telnetting (I would use netcat too), but you might want to read up on exactly what commands can be sent to the server once you telnet to it. The use of netcat is most highly recommended as well; read netcat's manual page and look at the flags in there too.

Note: For the rest of you, telnetting to a web server and looking at the page through it is a quaint and kind of geeky but PERFECTLY legal (and not covert as such) way of looking at a page. How do you guys think the client sends commands to the server, exactly?
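
If you'd rather script it than type it by hand, the same raw request you'd type after "telnet host 80" can be sent from Python over a plain socket (the host below is only a placeholder):

```python
# Send the same raw HTTP request you would type into telnet or netcat.
# The host is a placeholder; this only works against plain HTTP on port 80.
import socket

host = "example.com"
request = (
    "HEAD / HTTP/1.1\r\n"      # HEAD: status line and headers only
    f"Host: {host}\r\n"
    "Connection: close\r\n"
    "\r\n"
)

with socket.create_connection((host, 80), timeout=10) as s:
    s.sendall(request.encode("ascii"))
    reply = b""
    while chunk := s.recv(4096):
        reply += chunk

print(reply.decode("iso-8859-1"))
```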

With regard to the robots.txt file, I am sure your research has indicated that robots.txt is essentially an "agreed upon" (but not universally enforced; a kind of "we are good boys and girls and listen to our elders") way to limit access by robots/spiders, but there are a lot of userland apps that don't conform to it. You, in your research (or your software solution, depending on your assignment), don't need to comply with it. The bad people certainly don't and won't.
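
As a small illustration (the host is only a placeholder), anyone, or any script, can read the file and see exactly which paths the owner asked robots to stay out of; nothing actually stops a person from visiting them:

```python
# robots.txt is advisory: it lists paths the owner asks well-behaved crawlers
# to avoid, which also makes it a handy index of "interesting" paths.
import urllib.request, urllib.error

host = "http://example.com"   # placeholder
try:
    with urllib.request.urlopen(f"{host}/robots.txt", timeout=10) as resp:
        for line in resp.read().decode("utf-8", "replace").splitlines():
            if line.strip().lower().startswith("disallow:"):
                print("Asked-to-avoid path:", line.split(":", 1)[1].strip())
except urllib.error.HTTPError as err:
    print("No robots.txt served:", err.code)
```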

There are different ways of investigating different types of pages, some of which are greatly dependent on client versus server-side scripting. Looking into that alone will give you some interesting assignment topics.

This I've got to ask: are you just discussing this issue in your assignment, or developing a software solution as well?

If you wish to expand on it (and don't worry, that does not constitute collusion or plagiarism, so you should be okay) after you have decided on specific things you wish to look at, we could give you some more info.

And, by the way, everything the site shows or fails to show is of interest to someone conducting an investigation: from server response times to web page-related information, to the existence of phpinfo, to the type and style of error messages you get for, e.g., a 404 response. Investigating a web page truly is more than just looking at what the page itself contains, especially if you're looking for hidden things.
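
As a tiny example of that (placeholder host and probe path, and Python only for illustration), comparing the Server header and the body of a deliberate 404 already tells you something about the stack behind the site:

```python
# Look at what the server volunteers: its Server/X-Powered-By headers and the
# style of its 404 page. Host and probe path are placeholders.
import urllib.request, urllib.error

host = "http://example.com"
probe = f"{host}/this-page-should-not-exist-12345"

try:
    resp = urllib.request.urlopen(probe, timeout=10)
except urllib.error.HTTPError as err:   # a 404 still carries headers and a body
    resp = err

print("Status:", resp.code)
print("Server:", resp.headers.get("Server"))
print("X-Powered-By:", resp.headers.get("X-Powered-By"))
print("Start of the error page:", resp.read(200))
```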

Cheers
DarkSYN

PS: Just thought to say that you should actually sit down and set up a web server on your home computer, johnR, and look at how things work. IIS is part of WinXP Pro, but you should also install an Apache + PHP + MySQL server and toy around with it too, not just to see how web servers work but also what goes on between HTTP clients and HTTP servers.


   