Capturing social networking websites

16 Posts
13 Users
0 Reactions
1,848 Views
4n6art
(@4n6art)
Reputable Member
Joined: 18 years ago
Posts: 208
Topic starter  

I've been asked to capture MySpace, Facebook and LinkedIn profiles for a client. I have the usernames/passwords of the people whose profiles I need to capture.

Questions
- Will website capture programs like HTTrack, WebSnake, Wget etc work with social networking sites and get me everything I want?

- If not, aside from PDFing each page, how are others capturing these sites?

Any and all suggestions appreciated.
Thank you folks!
-=Art=-


   
(@ctendell)
Trusted Member
Joined: 16 years ago
Posts: 62
 

Oddly enough, I recently completed a case that required capturing websites in a similar manner, although in that case it was from the Wayback Machine.
And assuming the social sites aren't using a restrictive robots.txt, you should be fine.

I had a great deal of success using HTTrack, Wget and one other application whose name escapes me at the moment.

The problem with these tools is that they make the results difficult to produce. Saving the full page with hyperlinks and all is great, but (and I assume you're dealing with attorneys) we still live in a "print it all" world.

That being said, you may be better off just PDFing. If production isn't an issue, HTTrack or a similar tool should work just fine.
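
On the robots.txt point, here is a minimal Python sketch of how one might check in advance whether a site's rules would block a crawler. It uses the standard library's urllib.robotparser. The rules, the host name social.example and the helper name allowed_by_robots are made up for illustration and are not taken from any real site. Spidering tools differ in whether and how they honor robots.txt, so treat this as a pre-check only.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules for illustration only (not any real site's robots.txt).
SAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for path in ("/profile/example", "/private/messages"):
        url = "https://social.example" + path
        print(url, allowed_by_robots(SAMPLE_RULES, "HTTrack", url))
```

In practice you would save the site's actual robots.txt alongside the capture and feed its text to the same function.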


   
(@kovar)
Prominent Member
Joined: 18 years ago
Posts: 805
 

Greetings,

Possibly obvious suggestion, but make certain that you have permission from the owners of the accounts to access the accounts. Otherwise, you risk legal trouble of your own.

I agree with ctendell - if in doubt, PDF it all. Acrobat Pro does this nicely.

-David


   
bshavers
(@bshavers)
Estimable Member
Joined: 20 years ago
Posts: 211
 

Consider running a video capture as you go through the sites printing and/or PDF'ing pages.

PDF/Printing won't capture expanding menus, or audio or video that may be playing. With a video capture you can click every link, watch every video, and end up with a recording that you can print screenshots from. The video capture may also reduce the risk of claims that you planted anything on the accounts or saw something you should not have (if anything falls outside what you have consent to view and capture).


   
keydet89
(@keydet89)
Famed Member
Joined: 21 years ago
Posts: 3568
 

Questions
- Will website capture programs like HTTrack, WebSnake, Wget etc work with social networking sites and get me everything I want?

What would happen if you tried it on a test account first, and then looked to see if it captured everything you wanted?


   
Wardy
(@wardy)
Estimable Member
Joined: 20 years ago
Posts: 149
 

Keydet has hit the nail on the head. Create dummy accounts and populate them. Then test and verify. That way you'll be certain of what is/isn't captured.


   
(@tsweet)
Active Member
Joined: 17 years ago
Posts: 7
 

What would happen if you tried it on a test account first, and then looked to see if it captured everything you wanted?

This is the best approach in my opinion. It really helps your evidence stand up if you can show a process more than once and have repeatable results.
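
A minimal sketch of one way to show repeatable results, assuming Python and a capture that has been saved to a local folder (for example an HTTrack or Wget mirror of a test account). It hashes every file with SHA-256 so that two runs can be compared. The function name build_manifest and the sample files are hypothetical, chosen only for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def build_manifest(capture_dir: str) -> dict:
    """Map each file's path (relative to capture_dir) to its SHA-256 hex digest."""
    root = Path(capture_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


if __name__ == "__main__":
    # Demo on a throwaway folder standing in for a saved capture.
    with tempfile.TemporaryDirectory() as demo_dir:
        (Path(demo_dir) / "index.html").write_text("<html>test profile</html>")
        first = build_manifest(demo_dir)
        second = build_manifest(demo_dir)
        print("identical runs:", first == second)
        for name, digest in first.items():
            print(digest, name)
```

Identical manifests only show that the saved files did not change between runs. They say nothing about whether the tool captured everything on the live page, which is what the test account is for.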


   
(@douglasbrush)
Prominent Member
Joined: 16 years ago
Posts: 812
 

Why not do it all?

Capture and download so you have the code. PDF the pages through the browser so you have a static view of what will show. Use CamStudio (http://camstudio.org/) to capture everything, and document your findings in your notes.

Civil or criminal by the way? I agree with David that you have to review legal ramifications of your actions as well.

You may want to send preservation requests to the social media hosts as an act of due diligence if you anticipate any further action down the road. Each site has a legal department and a method for submitting the requests. If it is not criminal, good luck, but at least you will have covered all your bases.


   
bshavers
(@bshavers)
Estimable Member
Joined: 20 years ago
Posts: 211
 

Social networking sites are extremely dynamic and change every time anyone updates their status or posts to someone's wall. The owner doesn't really have much control over all the content. So although you can conduct a repeatable process, the results can be drastically different 60 seconds later. If there are 100 'friends' on the Facebook account, that is 100 different people who can add or remove content on your client's (or your client's client's) account by tagging photos, writing or deleting comments on the 'wall', etc.
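
To make that kind of drift visible, one option is to compare two captures taken at different times. The sketch below is in Python and assumes each capture has already been summarized as a mapping of file path to content hash. That format, the function name compare_captures and the sample values are assumptions for illustration only.

```python
def compare_captures(earlier: dict, later: dict) -> dict:
    """Classify paths as added, removed, or changed between two captures.

    Each argument maps a file path to a hash of that file's content.
    """
    earlier_paths = set(earlier)
    later_paths = set(later)
    return {
        "added": sorted(later_paths - earlier_paths),
        "removed": sorted(earlier_paths - later_paths),
        "changed": sorted(
            p for p in earlier_paths & later_paths if earlier[p] != later[p]
        ),
    }


if __name__ == "__main__":
    # Made-up manifests standing in for two captures of the same account.
    capture_1 = {"wall.html": "aaa", "photos.html": "bbb", "info.html": "ccc"}
    capture_2 = {"wall.html": "ddd", "photos.html": "bbb", "notes.html": "eee"}
    for kind, paths in compare_captures(capture_1, capture_2).items():
        print(kind, paths)
```

A difference between captures is expected on a live account, so the useful part is being able to say exactly which pages differ and when each capture was taken.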

When you log into the sites (at least with Facebook and MySpace, not so much LinkedIn), you (as in the account you logged into…) will probably be seen as "Online" by all the friends. Given some people's uncontrollable need to send messages as soon as their friends log in, you may see real-time chats from these friends, intended to be seen by the owner of the account. These won't be captured by a printout of the page, but will be captured by screen/video captures. I'd suggest that as soon as you log in, you change the configuration to show 'offline' to prevent private messages from being inadvertently seen and/or captured.

Also, when logging into Facebook from your machine, 'your' RAM may fill with chats from that Facebook account from days or weeks prior. If you grab and examine your RAM, you'll probably find prior chats that are not showing on the Facebook website… I'm not totally sure why that data populates your RAM if it is not accessible to the account user, or maybe it is accessible through an option I am unaware of.

And be aware that some code on the Facebook site will gather some of your machine's information, especially if you click on any survey. Lots of other nasty stuff can be there too, such as exploits that run as soon as you log in. Use a clean machine and clean it up when you are done, or use a new virtual machine that has not visited any websites so you get a clean capture, uninfluenced by prior visits to other social networking sites. VMware Workstation has a built-in video capture feature too.

Given the popularity of these sites, and that IP is running out the front and back door through ignorant postings of internal corporate/opsec data on them (not to mention other criminal acts like drugs, terrorism, and ID theft), capturing these sites will more than likely become the norm, including capturing them on multiple occasions because of their dynamic nature. I would foresee video capture being the primary method of capturing everything, if only because of the dynamic data on the sites that can't be captured otherwise. Specific screenshots from the video can show (on paper) some of the dynamic information. With a very popular person, you could also have so many external links, friends, postings, surveys, groups, and anything else s/he joined that your video capture will be going all over the place chasing rabbits down different trails, where simply PDF'ing won't do your collection justice.

And these are only SOME of the reasons I neither have nor like Facebook and MySpace.


   
4n6art
(@4n6art)
Reputable Member
Joined: 18 years ago
Posts: 208
Topic starter  

Thank you all for the great responses.

To answer a few questions:
- This is a civil case. Opposing Counsel has given the usernames and passwords to Counsel, and they know that we will be capturing the sites. This is not a case where we found the username/password on the target machine and are trying to backdoor (for lack of a better word) into the accounts.

- I just got the request yesterday - I am going to test it like Harlan and others suggest - just hadn't gotten around to it.

- The idea of using a clean system and going offline as soon as I log in is well taken - thank you :)

Looks like a combination of spidering, PDFing and possibly video recording the screen will be what's called for.

@CTENDELL I've had to do an archive.org (Wayback Machine) capture before. Apparently, the automated capture programs didn't work too well because Archive uses some kind of JavaScript to populate those pages with the old data. I never got good results, and research found that others had the same issue too - FYI.

Thanks again, all!
-=Art=-


   