HTTP and Apache Web Server

HyperText Transfer Protocol (HTTP) is the protocol used by the World Wide Web; version 1.1 was defined by RFC 2616, which later RFCs have since superseded. It specifies how messages are formatted and transmitted, and which actions web servers and browsers should take in response to various commands.

An HTTP session is a sequence of network request-response transactions. An HTTP client (the user agent) sends a request by establishing a Transmission Control Protocol (TCP) connection to a particular port on a server (port 80 by default). An HTTP server listening on that port waits for a client’s request message. Upon receiving the request, the server sends back a status line, such as “HTTP/1.1 200 OK”, and a message of its own. The body of this message is typically the requested resource, although an error message or other information may also be returned.
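The request and status line described above can be sketched in Python without touching the network. This is a minimal illustration rather than a full HTTP implementation; the helper names build_get_request and parse_status_line and the host www.example.com are assumptions made for the example.

```python
def build_get_request(host, path="/"):
    """Return the bytes of a minimal HTTP/1.1 GET request (illustrative helper)."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, path, protocol version
        f"Host: {host}",          # the Host header is mandatory in HTTP/1.1
        "Connection: close",      # ask the server to close the connection afterwards
        "",                       # an empty line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(line):
    """Split a status line such as 'HTTP/1.1 200 OK' into its three parts.

    Simplification: assumes a reason phrase is present after the status code.
    """
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason
```

Calling parse_status_line("HTTP/1.1 200 OK") yields the protocol version, the numeric status code, and the reason phrase as separate values, which is what a client inspects before reading the message body.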

HTTP defines several commands and responses. The most frequent is the GET request, which the client sends, together with the name of a file, to retrieve that file from a web server. The server answers with an HTTP response carrying a status code of 200 (meaning “OK”) and the file’s contents. HTML, in turn, specifies how Web pages are formatted and displayed. HTTP is a stateless protocol: the server is not required to retain information or status about each user across multiple requests. However, some web applications implement state or server-side sessions using one or more of the following methods:

  • HTTP cookies.
  • Query string parameters, for example, /index.php?session_id=some_unique_session_code.
  • Hidden variables within web forms.
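The first two methods can be sketched with Python's standard library. This is only an illustration of where a session identifier travels; the function names are hypothetical, and the parameter name session_id is taken from the query string example above.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs, urlsplit


def session_from_query(url, param="session_id"):
    """Return the session identifier carried in the query string, or None."""
    values = parse_qs(urlsplit(url).query).get(param)
    return values[0] if values else None


def session_from_cookie(cookie_header, name="session_id"):
    """Return the session identifier carried in a Cookie header value, or None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get(name)
    return morsel.value if morsel is not None else None
```

In both cases the server itself stays stateless at the protocol level: the client presents the identifier again with every request, and the application looks up the matching session data.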

HTTPS

Hypertext Transfer Protocol Secure (HTTPS) is used for secure communication on the Internet. It layers HTTP on top of the SSL/TLS protocol, thus adding the security capabilities of SSL/TLS to standard HTTP communications. HTTPS authenticates the web site and the associated web server that the client is communicating with, and it provides bidirectional encryption of communications between client and server. HTTPS encrypts the HTTP exchange, including the request URL, query parameters, headers, and cookies.
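As a rough sketch of the client side, Python's ssl module can wrap a TCP socket in TLS before any HTTP bytes are sent. The function names below are hypothetical, and fetch_status_line needs network access to a real HTTPS server, so treat it as an outline under those assumptions.

```python
import socket
import ssl


def make_client_context():
    """Create a TLS context that verifies the server certificate and hostname."""
    return ssl.create_default_context()


def fetch_status_line(host, port=443, timeout=5.0):
    """Open a TLS connection and return the first line of the HTTP response."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        # server_hostname is used for hostname verification of the certificate.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            # Everything sent from here on, including the request line,
            # headers, and any cookies, is encrypted on the wire.
            tls_sock.sendall(request.encode("ascii"))
            with tls_sock.makefile("rb") as reader:
                return reader.readline().decode("iso-8859-1").strip()
```

The default context requires a valid certificate and a matching hostname, which corresponds to the authentication of the web site described above.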

Uniform Resource Locators

A URL is the unique address of a file that is accessible on the Internet. Any file on the World Wide Web has a URL, whether it is an HTML page, an image file, or a program such as a Common Gateway Interface (CGI) application or Java applet. The URL contains the name of the protocol to be used to access the file resource, a domain name that identifies a specific computer on the Internet, and a pathname, a hierarchical description that specifies the location of a file on that computer.

Every URL consists of some of the following: the scheme name (commonly called the protocol), followed by a colon and two slashes; then, depending on the scheme, a server name (e.g., ftp, www, smtp) followed by a dot (.) and then a domain name (alternatively, an IP address); a port number; the path of the resource to be fetched or the program to be run; then, for programs such as Common Gateway Interface (CGI) scripts, a query string; and an optional fragment identifier. The syntax is: scheme://domain:port/path?query_string#fragment_id
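The components in this syntax can be pulled apart with urllib.parse from Python's standard library. The helper name describe_url and the sample URL in the note below are made up for illustration.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Break a URL into the components named in the syntax above."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # e.g. http
        "host": parts.hostname,      # server name plus domain, or an IP address
        "port": parts.port,          # None when the URL does not give a port
        "path": parts.path,          # resource to fetch or program to run
        "query": parts.query,        # query string, without the leading '?'
        "fragment": parts.fragment,  # fragment identifier, without the leading '#'
    }
```

For example, describe_url("http://www.example.com:8080/cgi-bin/search.cgi?q=apache#results") separates the scheme http, the host www.example.com, the port 8080, the path /cgi-bin/search.cgi, the query string q=apache, and the fragment results.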

The browser identifies a web page, or an item on a web page, by the Uniform Resource Locator (URL), or web address, entered in the browser’s address area. The end node uses DNS to discover the IP address corresponding to the requested hostname; a URL may also contain the IP address of the web server directly. The steps followed are:

  • The user enters the URL in the browser.
  • The client’s PC sends a DNS request to the DNS server (whose address is either learnt via DHCP or configured manually).
  • The DNS server replies with the IP address.
  • The client’s PC establishes a new TCP connection to the web server and data transfer starts.
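The name lookup and the TCP connection in the steps above can be sketched with Python's socket module. The helper names are hypothetical; resolve asks the operating system's resolver, which in turn queries the configured DNS server for names that are not already numeric addresses.

```python
import socket


def resolve(hostname, port=80):
    """Return the distinct IP addresses the resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def connect(hostname, port=80, timeout=5.0):
    """Resolve hostname and open a TCP connection to the web server."""
    return socket.create_connection((hostname, port), timeout=timeout)
```

When the URL already contains an IP address, such as 127.0.0.1, resolve returns that address unchanged and no DNS query is needed.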

An HTTP (Hypertext Transfer Protocol) server, or web server, is a network service that serves content to a client over the web. This typically means web pages, but any other documents can be served as well. The configuration files used by the Apache web server are:

  • /etc/httpd/conf/httpd.conf – The main configuration file.
  • /etc/httpd/conf.d/ – An auxiliary directory for configuration files that are included in the main configuration file.

The following directives are commonly used in the /etc/httpd/conf/httpd.conf configuration file:

<Directory> – The <Directory> directive allows you to apply certain directives to a particular directory only. It takes the following form:

    <Directory directory>
        directive
    </Directory>

The directory can be either a full path to an existing directory in the local file system, or a wildcard expression. This directive can be used to configure additional cgi-bin directories for server-side scripts located outside the directory that is specified by ScriptAlias. In this case, the ExecCGI and AddHandler directives must be supplied, and the permissions on the target directory must be set correctly (that is, 0755). Example –

    <Directory /var/www/html>
        Options Indexes FollowSymLinks
        AllowOverride None
        Order allow,deny
        Allow from all
    </Directory>

<IfDefine> – The <IfDefine> directive allows you to use certain directives only when a particular parameter is supplied on the command line. It takes the following form:

    <IfDefine [!]parameter>
        directive
    </IfDefine>

The parameter can be supplied at a shell prompt using the -D parameter command-line option (for example, httpd -DEnableHome). If the optional exclamation mark (that is, !) is present, the enclosed directives are used only when the parameter is not specified. Example –

    <IfDefine EnableHome>
        UserDir public_html
    </IfDefine>

<IfModule> – The <IfModule> directive allows you to use certain directives only when a particular module is loaded. It takes the following form:

    <IfModule [!]module>
        directive
    </IfModule>

The module can be identified either by its name, or by the file name. If the optional exclamation mark (that is, !) is present, the enclosed directives are used only when the module is not loaded. Example –

    <IfModule mod_disk_cache.c>
        CacheEnable disk /
        CacheRoot /var/cache/mod_proxy
    </IfModule>

<Location> – The <Location> directive allows you to apply certain directives to a particular URL only. It takes the following form:

    <Location url>
        directive
    </Location>

The url can be either a path relative to the directory specified by the DocumentRoot directive (for example, /server-info), or an external URL such as http://example.com/server-info. Example –

    <Location /server-info>
        SetHandler server-info
        Order deny,allow
        Deny from all
        Allow from .example.com
    </Location>
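The interplay of Order, Allow, and Deny in the examples above is easy to misread, so here is a simplified Python model of how the two orderings combine. It is an illustration of the documented evaluation rules, not Apache's implementation; the function names are hypothetical, and only the keyword all and (partial) domain names are matched, not IP addresses or networks.

```python
def host_matches(pattern, client_host):
    """Simplified matching: the keyword 'all', or a full or partial domain name."""
    if pattern == "all":
        return True
    pattern = pattern.lower()
    client_host = client_host.lower()
    if pattern.startswith("."):
        return client_host.endswith(pattern)
    return client_host == pattern or client_host.endswith("." + pattern)


def is_allowed(order, allow, deny, client_host):
    """Model 'Order deny,allow' and 'Order allow,deny' for one client host."""
    allowed = any(host_matches(p, client_host) for p in allow)
    denied = any(host_matches(p, client_host) for p in deny)
    if order == "deny,allow":
        # Deny rules are evaluated first and Allow rules can override them;
        # a client matching neither list is allowed by default.
        return allowed or not denied
    if order == "allow,deny":
        # Allow rules are evaluated first and Deny rules can override them;
        # a client matching neither list is denied by default.
        return allowed and not denied
    raise ValueError(f"unknown order: {order!r}")
```

With the /server-info settings above (Order deny,allow, Deny from all, Allow from .example.com), a client named host.example.com is admitted while host.example.org is refused.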

<Proxy> – The <Proxy> directive allows you to apply certain directives to the proxy server only. It takes the following form:

    <Proxy pattern>
        directive
    </Proxy>

The pattern can be an external URL, or a wildcard expression (for example, http://example.com/*). Example –

    <Proxy *>
        Order deny,allow
        Deny from all
        Allow from .example.com
    </Proxy>

<VirtualHost> – The <VirtualHost> directive allows you to apply certain directives to particular virtual hosts only. It takes the following form:

    <VirtualHost address[:port]…>
        directive
    </VirtualHost>

The address can be an IP address, a fully qualified domain name, or a special form: * represents all IP addresses, and _default_ represents unmatched IP addresses.

AccessFileName – The AccessFileName directive allows you to specify the file to be used to customize access control information for each directory. It takes the following form:

AccessFileName filename…

The filename is the name of the file to look for in the requested directory. By default, the server looks for .htaccess.

Action – The Action directive allows you to specify a CGI script to be executed when a certain media type is requested.

ServerRoot – This is used for specifying the base directory for the web server. On Fedora, RHEL, and CentOS distributions, this value, by default, is the /etc/httpd/ directory. The default value for this directive in Ubuntu, OpenSuSE, and Debian Linux distributions is /etc/apache2/.

Listen – This is the port(s) on which the server listens for connection requests. It can also be used to specify the particular IP addresses over which the web server accepts connections. The default value for this directive is 80 for nonsecure web communications.

Listen [IP-address:]portnumber

ServerName – This directive defines the hostname and port that the server uses to identify itself. At many sites, servers fulfill multiple purposes. An intranet web server that isn’t getting heavy usage, for example, should probably share its usage allowance with another service. In such a situation, a computer name such as “www” (fully qualified domain name, or FQDN=www.example.org) wouldn’t be a good choice, because it suggests that the machine has only one purpose.

ServerAdmin – This is the e-mail address that the server includes in error messages sent to the client.

DocumentRoot – This defines the primary directory on the web server from which HTML files will be served to requesting clients. On Fedora distros and other Red Hat–like systems, the default value for this directive is /var/www/html/. On OpenSuSE and SLE distributions, the default value for this directive is /srv/www/htdocs.

MaxClients – This sets a limit on the number of simultaneous requests that the web server will service. (In Apache 2.4 this directive was renamed MaxRequestWorkers.)

LoadModule – This is used for loading or adding other modules into Apache’s running configuration. It adds the specified module to the list of active modules.

LoadModule module filename

User – This specifies the user ID the web server will answer requests as. The server process will initially start off as the root user, but will later downgrade its privileges to those of the user specified here. The user should only have just enough privileges to access files and directories that are intended to be visible to the outside world via the web server. Also, the user should not be able to execute code that is not HTTP- or web-related.

Group – This specifies the group name of the Apache HTTP server process. It is the group with which the server will respond to requests. The default value under the Fedora and RHEL flavors of Linux is “apache.” In OpenSuSE Linux, the value is set to the group “www.” In Ubuntu, the default value is “www-data.”

Include – This directive allows Apache to specify and include other configuration files at runtime. It is mostly useful for organization purposes; you can, for example, elect to store all the configuration directives for different virtual domains in appropriately named files, and Apache will automatically know to include them at runtime.

Include file_name_to_include_OR_path_to_directory_to_include

UserDir – This directive defines the subdirectory within each user’s home directory, where users can place personal content that they want to make accessible via the web server. This directory is usually named public_html and is usually stored under each user’s home directory. This option is, of course, dependent on the availability of the mod_userdir module in the web server setup.

A sample usage of this option in the httpd.conf file is – UserDir public_html
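With UserDir public_html in effect, a request for a URL of the form /~user/file is served from that user's public_html directory. The Python sketch below models that translation under two loud assumptions: home directories live under /home (Apache actually looks up each user's real home directory), and no sanitising of the path is attempted. The function name userdir_path is hypothetical.

```python
import posixpath


def userdir_path(url_path, userdir="public_html", home_base="/home"):
    """Map '/~user/rest' to '<home_base>/user/<userdir>/rest'.

    Returns None when the URL is not a per-user URL.
    Assumption: home directories are located at <home_base>/<user>.
    """
    if not url_path.startswith("/~"):
        return None
    user, _, rest = url_path[2:].partition("/")
    if not user:
        return None
    return posixpath.join(home_base, user, userdir, rest)
```

Under these assumptions, /~alice/index.html maps to /home/alice/public_html/index.html, while an ordinary path such as /index.html is left to the normal DocumentRoot handling.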

ErrorLog – This defines the location where errors from the web server will be logged to.

LogLevel – This option sets the level of verbosity for the messages sent to the error logs. Acceptable log levels are emerg, alert, crit, error, warn, notice, info, and debug. The default log level is “warn.”
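The levels are ordered from most to least severe, and setting a level also logs every message of higher severity. A small Python model of that rule, using a hypothetical helper named is_logged, looks like this:

```python
# Ordered from most severe to least severe, as listed above.
LOG_LEVELS = ["emerg", "alert", "crit", "error", "warn", "notice", "info", "debug"]


def is_logged(message_level, configured_level="warn"):
    """Return True if a message at message_level passes the configured LogLevel."""
    return LOG_LEVELS.index(message_level) <= LOG_LEVELS.index(configured_level)
```

With the default of warn, an error message is written to the error log while an info message is dropped; raising the verbosity to debug lets every level through.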

Alias – The Alias directive allows documents (web content) to be stored in any other location on the file system that is different from the location specified by the DocumentRoot directive. It also allows you to create abbreviations (or aliases) for path names that might otherwise be quite long.
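The effect of an alias on URL-to-file mapping can be modelled in a few lines of Python. This is a simplified sketch, not Apache's own path translation: the function name apply_alias is hypothetical, and the alias /icons pointing at /usr/share/icons in the note below is an invented example.

```python
def apply_alias(url_path, aliases, document_root):
    """Return the file-system path for url_path.

    aliases is a list of (url_prefix, file_system_path) pairs; the first
    matching prefix wins, otherwise the path is resolved under document_root.
    """
    for url_prefix, fs_path in aliases:
        prefix = url_prefix.rstrip("/")
        # Match whole path components only, so '/icons' does not match '/iconset'.
        if url_path == prefix or url_path.startswith(prefix + "/"):
            return fs_path.rstrip("/") + url_path[len(prefix):]
    return document_root.rstrip("/") + url_path
```

With aliases = [("/icons", "/usr/share/icons")] and a document root of /var/www/html, the URL path /icons/folder.png resolves to /usr/share/icons/folder.png, while /index.html still resolves to /var/www/html/index.html.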

ScriptAlias – The ScriptAlias option specifies a target directory or file as containing CGI scripts that are meant to be processed by the CGI module (mod_cgi).

ScriptAlias URL-path actual_file-path_OR_directory-path

VirtualHost – One of the most-used features of Apache is its ability to support virtual hosts. This makes it possible for a single web server to host multiple web sites as if each site had its own dedicated hardware. It works by allowing the web server to provide different, autonomous content, based on the hostname, port number, or IP address that is being requested by the client. Hostname-based selection is made possible by HTTP/1.1, in which the client names the desired site in the Host request header, so the server does not have to infer the site from the IP address alone.
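Name-based selection can be pictured as a lookup keyed on the Host request header that HTTP/1.1 clients send. The following Python sketch is a simplified model, not Apache's matching logic: the function name, the site names, and the directories are hypothetical, and an explicit default stands in for whatever host the server falls back to.

```python
def select_site(host_header, sites, default_site):
    """Pick the document root for a request based on its Host header.

    sites maps lower-case hostnames to document roots. A port suffix such as
    ':8080' is ignored. Simplification: IPv6 literals are not handled.
    """
    hostname = host_header.split(":", 1)[0].strip().lower()
    return sites.get(hostname, default_site)
```

Given sites = {"www.example.com": "/var/www/example-com", "www.example.org": "/var/www/example-org"}, a request with Host: www.example.org is served from /var/www/example-org even though both names point at the same IP address.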

Ports

The default port for HTTP requests is port 80, but you can also configure a web server to use a different (arbitrarily chosen) port that is not in use by another service. This allows sites to run multiple web servers on the same host, with each server on a different port. Some sites use this arrangement for multiple configurations of their web servers to support various types of client requests.
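To make the idea of several web servers on one host concrete, the sketch below starts two tiny servers from Python's standard library on different ports of the loopback interface. It stands in for two separately configured web server instances and is not an Apache setup; the names make_handler, start_server, and fetch are hypothetical, and port 0 asks the operating system for any free port.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(site_name):
    """Build a request handler class that answers every GET with a short text."""
    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = f"hello from {site_name}\n".encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the example quiet

    return Handler


def start_server(site_name, port=0):
    """Start a server on 127.0.0.1 in a background thread and return it."""
    server = HTTPServer(("127.0.0.1", port), make_handler(site_name))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(port, path="/"):
    """Send a GET request to the local server on the given port."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read().decode("utf-8")
    finally:
        conn.close()
```

Each server's actual port is available as server.server_address[1]; fetching from the two ports returns two different responses from the same host. Call shutdown() and server_close() on each server when finished.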

Running a web server on a Linux/UNIX platform forces you to be more aware of the traditional Linux/UNIX permissions and ownership model. In terms of permissions, that means each process has an owner and that owner has limited rights on the system.

Apache Modules

One reason Apache is so powerful and flexible is that its design allows extensions through modules. Apache comes with many modules by default and automatically includes them in the default installation. Some common Apache modules are:

  • mod_cgi – Allows the execution of CGI scripts on the web server
  • mod_perl – Used to incorporate a Perl interpreter into the Apache web server
  • mod_aspdotnet – Provides an ASP.NET host interface to Microsoft’s ASP.NET engine
  • mod_authz_ldap – Provides support for authenticating users of the Apache HTTP server against a Lightweight Directory Access Protocol (LDAP) database
  • mod_ssl – Provides strong cryptography for the Apache web server via the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols
  • mod_ftpd – Allows Apache to accept FTP connections
  • mod_userdir – Allows user content to be served from user-specific directories on the web server via HTTP
