Security aspects of data on a web server managed by a third-party hosting service
Every now and then we see advertisements for free web space and domain names. At some point or other, most of us have thrown caution to the wind, registered on one of these sites and built a web page. When we go about dumping our files on these online servers, there are some factors that should be considered.
I don't mean to say that free web space is insecure or unsafe, but it can be if not used carefully. Besides, it doesn't matter whether the web space is free or paid; security should be considered either way.
This article applies mostly to cases where you have a shell account (access using SSH, Telnet or rlogin) on the server, or at least FTP access, so that you can save files on the server. It does not really apply to MySpace, Orkut or similar websites, although some of the points should be considered even when storing files on these websites.
Security is directly proportional to paranoia. I am being a little extra paranoid about security here, protecting as much as possible from whomever possible and not trusting anyone (human or process). This is a general set of guidelines to consider while securing data on a web server; their effectiveness depends largely on the extent to which they are followed and how well they are implemented.
Who are potential threats to your data?
- Web Users (Humans & Crawlers)
- Users of the Unix machine (where web server is located)
- System admins and root users (superusers or sudoers)
How to Secure data?
- Use File Access Permissions
- Encrypt confidential data
- Use htpasswd to restrict access to files and directories from the web (internet users)
- Restrict Web spiders and crawlers
- Archive data
File Permissions
This is the first and foremost step towards restricting access to files and directories residing on any Unix machine.
Assign general permissions such as 755 to top-level directories (for example public_html) and make them more restrictive as you go down into subdirectories. I would, in fact, assign a 700 permission to most of my files and directories (all access for the owner, no access for group or others); 600 is enough for files that never need to be executed.
As for the HTML files, JPEGs and all other static web content (the files that form the website), we are in many cases forced to make them readable by everyone so that the web server process can access them: typically 755 for the directories and 644 for the files themselves, since static content needs read permission but not execute.
It all depends on how the Unix machine is set up (in other words, the contents of "/etc/passwd" and "/etc/group") and how the web server is set up, i.e. httpd.conf and access.conf.
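The permission scheme above can be sketched as a small script. This is only an illustration under my own assumptions: the directory name "private" and the exact modes are hypothetical choices, and your host may need something different depending on which user the web server runs as.

```python
import os
import stat

# Assumed policy (an illustration, not a rule from any particular host):
# public directories 755, public static files 644,
# private directories 700, private files 600.
PUBLIC_DIR = 0o755
PUBLIC_FILE = 0o644
PRIVATE_DIR = 0o700
PRIVATE_FILE = 0o600


def apply_permissions(root, private_names=("private",)):
    """Walk `root` and chmod every directory and file.

    Any directory whose name is listed in `private_names` (a hypothetical
    naming convention), and everything beneath it, becomes owner-only.
    """
    for current, _dirs, files in os.walk(root):
        rel = os.path.relpath(current, root)
        parts = [] if rel == "." else rel.split(os.sep)
        private = any(part in private_names for part in parts)
        os.chmod(current, PRIVATE_DIR if private else PUBLIC_DIR)
        for name in files:
            path = os.path.join(current, name)
            os.chmod(path, PRIVATE_FILE if private else PUBLIC_FILE)


def mode_of(path):
    """Return only the permission bits of `path`, e.g. 0o644."""
    return stat.S_IMODE(os.stat(path).st_mode)
```

Calling apply_permissions("public_html") would leave the site readable by the web server while anything under public_html/private stays closed to group and others.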
Encrypt confidential data
If you're planning on storing confidential information on your third-party web space, encrypt it. My favorite encryption algorithm is AES. A 128-bit key is good enough, but you could use 256 bits depending on your level of paranoia.
You can also use double or triple AES. Double encryption simply means that you encrypt text using a key, then encrypt the resulting ciphertext again with the same key or a different one, and similarly for triple encryption. It is more work on your part because you have to follow the same procedure in reverse order while decrypting. Keep in mind that layering does not multiply the strength: because of meet-in-the-middle attacks, double encryption with two different keys is far weaker than the combined key length suggests, so a single pass with a longer key is usually the simpler choice.
You could either encrypt a stream of text and save it in text files for quick access (the file is plain text, but unreadable because its content is encrypted), using a web-based encryption tool that runs in the browser, or use a tool like ascrypt or acrypt on Windows or Unix that encrypts whole files.
It does not really matter which encryption algorithm or encryption tool you use (in this very context, not as a general rule), what really matters is HOW you use them. Try to use non-standard methodologies & practices while encrypting so that simple cracking tools cannot break them.
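One part of "how you use them" is how the key is made. AES is not part of the Python standard library, so the sketch below covers only that one step: stretching a passphrase into a 128-bit or 256-bit key with PBKDF2. The function name, the iteration count and the sample passphrase are my own assumptions for illustration; the encryption itself would still be done by whichever tool you choose.

```python
import hashlib
import os


def derive_aes_key(passphrase, salt, bits=128, iterations=200_000):
    """Derive an AES-sized key from a passphrase using PBKDF2-HMAC-SHA256.

    `iterations` is an assumed value; higher is slower for you and for
    anyone trying to guess the passphrase.
    """
    if bits not in (128, 192, 256):
        raise ValueError("AES keys are 128, 192 or 256 bits long")
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=bits // 8
    )


# The salt is random but not secret; store it next to the encrypted file.
salt = os.urandom(16)
key128 = derive_aes_key("a long passphrase only I know", salt)
key256 = derive_aes_key("a long passphrase only I know", salt, bits=256)
print(len(key128), len(key256))  # key lengths in bytes
```

The same passphrase and salt always give the same key, so nothing but the salt needs to be kept on the server.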
Use htpasswd to restrict access to files and directories from the web (internet users)
htpasswd is a utility shipped with the Apache web server that manages the user name and password file used for HTTP Basic authentication. Together with a few directives in httpd.conf or a .htaccess file, it provides a simple mechanism to password-protect directories and files that are accessed through the web server. This restricts web users from accessing the files. The system admins can still access the files, and so can anyone else who can access the hard drive on which the data resides.
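As a rough sketch of what the protection looks like on Apache, the helper below builds the contents of a .htaccess file for one directory. The function name, the realm text and the password file path are hypothetical; the password file itself would be created separately with the htpasswd command (for example `htpasswd -c` followed by the file path and a user name), and whether .htaccess files are honored at all depends on how your host has configured the server.

```python
def htaccess_for(password_file, realm="Restricted area"):
    """Return .htaccess text that asks for a valid user from `password_file`.

    `password_file` should be an absolute path, ideally outside the
    document root so the file cannot be downloaded through the web.
    """
    lines = [
        "AuthType Basic",
        f'AuthName "{realm}"',
        f"AuthUserFile {password_file}",
        "Require valid-user",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical path, for illustration only.
print(htaccess_for("/home/me/.htpasswd"))
```

Saving that output as .htaccess inside the directory to be protected makes the browser prompt for a user name and password before serving anything from it.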
Restrict Web spiders and crawlers
Lately, I've been seeing cases where Google searches have shown an awful lot of information about people, more information than those people imagined or wanted. Files that were stored by some friend of someone in some corner of some unknown server had started to show up in Google searches, all because Google has tons of spiders that keep crawling the web all the time. I am not criticizing crawlers or Google. Trust me, I love Google, and it's these crawlers that make Google searches so good, effective and useful. But a little care should be taken.
Use some mechanism to restrict web spiders so that they crawl only through certain directories and not through confidential stuff like photos or personal data.
This is very important because if the spiders crawl through all the files and directories of the web server's "document root", some of that information (like the directory structure) does show up in Google searches.
This can be done using a robots meta tag in each HTML page, or with a robots.txt file at the top of the site for whole directories. Both are only requests that well-behaved crawlers honor, not access controls.
Look at the google screenshot to see what I am talking about.
The information that shows up in Google searches may not be accessible to a web user, because the directory is restricted by htpasswd or the file is encrypted, but it still compromises security by revealing the directory structure, the names of files and so on.
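For individual HTML pages, the meta tag approach is a line such as `<meta name="robots" content="noindex, nofollow">` in the page's head. For whole directories, including images that cannot carry a meta tag, the usual mechanism is a robots.txt file in the document root. Here is a small sketch that checks a set of rules with Python's standard robots.txt parser before you upload them; the directory names are made-up examples.

```python
from urllib.robotparser import RobotFileParser

# Example rules: every crawler is asked to stay out of two directories.
# The directory names are hypothetical.
ROBOTS_TXT = """\
User-agent: *
Disallow: /photos/
Disallow: /personal/
"""


def is_allowed(path, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Return True if a crawler obeying `robots_txt` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


print(is_allowed("/index.html"))
print(is_allowed("/photos/holiday.jpg"))
```

Remember that robots.txt is itself publicly readable, so listing a directory there tells a curious human that it exists; it keeps polite crawlers out but is no substitute for htpasswd or encryption.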
Archive data
If you have directories filled with a whole bunch of files like text, HTML or even JPEGs, and the number of files is large, it's a good idea to at least tar these files or, even better, compress them. You can also password-protect these archives if you use a format or tool that supports it (plain tar does not). Crawlers cannot index the individual files inside an archive, and a password-protected archive cannot be opened without the password.
It also makes sure that automated scripts (involving grep, sed, awk etc.) don't parse through your text files.
As a general rule, archiving is a good way of keeping files sorted and keeping the file system clean. It reduces the number of files, so if there's indexing of any kind, the index will be small and searches will be fast. Of course, the downside is that archived files will not show up in searches. File transfers are also faster and backups are easier.
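A minimal sketch of the archiving step, using Python's standard tarfile module to pack a directory into one compressed file. The function name and the paths are hypothetical. Note that a tar archive is compressed here but not encrypted, so confidential contents would still need to be encrypted separately.

```python
import os
import tarfile


def archive_directory(src_dir, archive_path):
    """Pack `src_dir` into a gzip-compressed tar file.

    Returns the list of member names stored in the archive so the
    result can be checked before the originals are deleted.
    """
    top = os.path.basename(os.path.normpath(src_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(src_dir, arcname=top)
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()
```

For example, archive_directory("notes", "notes.tar.gz") would replace a directory full of small text files with a single file to transfer or back up. Write the archive somewhere outside the directory being packed so it does not try to include itself.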

3 comments:
nice article !!
On 1/8/07, Kalyan Rao wrote:
cool
nice article..... but if you like feedback I can be picky ;)
"You can also use Double or Triple AES"
There is DES, triple DES and AES.... don't know of any TRIPLE AES, and I don't suggest people use DES... it uses 56-bit keys, it's not safe anymore.
I am not happy with the line
"Who are threats to your data?
-Web Users (Human & Crawlers)"
I am more happy with the line " Restrict Web spiders and crawlers"
because we want to use web crawlers to our advantage ... so don't call crawlers as threats, they are threats but don't scare people dude... :)
Apart from that, all is cool
have fun.
Ka
..................................
Amol wrote:
Double AES and Triple AES are not standard; it's my own term for when we do AES twice or thrice. Triple AES is as secure as one can get.
Agree with you about DES. It's not secure; in an experiment someone actually used brute force and broke DES in 56 hours. I don't have the details though.
As for
"Who are threats to your data?
-Web Users (Human & Crawlers)"
Any Human on the web can be a threat. Crawler is not a threat directly, but the humans can use the crawler to their advantage, which makes crawler an indirect threat. I agree with you, I'll try to elaborate the threats section of my article.
I really appreciate your feedback.
Keep criticizing, thats my only way to improve.
Amol
Hi buddy,
You wrote a nice article. Nice info!!