Monday, 17 March 2014

Search Engines

Functions of a search engine


1. Crawling the internet for web content
2. Indexing the web content
3. Storing the website content
4. Search algorithms and results
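
The four functions above can be sketched in a few lines of Python. This is only a toy illustration, not how any real search engine is built: the `PAGES` dictionary stands in for the web (its URLs and text are made up), and the names `crawl`, `build_index`, and `search` are hypothetical helpers chosen for this example.

```python
import re
from collections import defaultdict

# Hypothetical in-memory "web": URL -> page text. A real crawler would
# fetch pages over HTTP and follow their links instead.
PAGES = {
    "http://example.com/a": "Search engines crawl the web",
    "http://example.com/b": "Engines index web content",
    "http://example.com/c": "Cookies store session data",
}

def crawl(pages):
    """Function 1: visit every page and yield (url, text) pairs."""
    for url, text in pages.items():
        yield url, text

def tokenize(text):
    """Split text into lowercase words made of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(crawled):
    """Functions 2 and 3: index each word and store the page content."""
    index = defaultdict(set)   # word -> set of URLs containing it
    store = {}                 # URL -> stored page text
    for url, text in crawled:
        store[url] = text
        for word in tokenize(text):
            index[word].add(url)
    return index, store

def search(index, query):
    """Function 4: return the URLs that contain every query word."""
    words = tokenize(query)
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)

index, store = build_index(crawl(PAGES))
print(search(index, "web"))
print(search(index, "cookies data"))
```

The index here is a simple word-to-pages lookup with no ranking. Real search algorithms also decide which matching pages to show first.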



Essential Vocabulary for the Internet



  • browser  A browser is the program you use to view web pages. To browse through a page, exploring what's there and seeing where the links take you, is a bit like window shopping. When you browse, you have to guess which words and links on the page pertain to your interests. The opposite of browsing is searching.

  • frame  Frames are a technique used in web pages to divide the page into multiple windows, where each window is called a frame and can contain its own separate page. The advantage of frames is that one window can be scrolled or changed while other windows remain fixed for such purposes as keeping a menu in view all the time. The disadvantage is that not all browsers support them.

  • server A server is a computer designed to provide various services for an entire network. It is typically either a workstation or a mainframe because it will usually be expected to handle far greater loads than ordinary desktop systems. The load placed on servers also necessitates that they utilize robust OSes, as a crash on a system that is currently being used by many people is far worse than a crash on a system that is only being used by one person.
  • plug-in A plug-in is a piece of software designed not to run on its own but rather work in cooperation with a separate application to increase that application's abilities.
  • applet An application that is downloaded from a web page and executed by browser software. Also, an HTML tag that defines an applet program.
  • cookie A cookie is a short file put on your system by a web page which includes information about your usage and facilitates the current interaction. For example, it may include the information that you have logged into a passworded area already in the current session and don't need a second password check. There are many uses for cookies. They may be erased at the end of a session or retained until the next session, and they may be encrypted or in plain text. For a more thorough explanation of cookies, see the cookie section of our article on Privacy and the Cookie FAQ at cookiecentral.com.
  • Telnet A protocol that lets you log in to a remote computer over a network and use it as if you were sitting at a terminal attached to it.
  • netiquette The etiquette on the Internet.
  • navigation button One of a set of buttons or graphic images, typically in a row or column (a navigation bar), used as a central point to link the user to major topic sections on a Web site. The navigation bar may be a single graphic image with multiple selections.
  • hypertext Generally, any text that contains links to other documents - words or phrases in the document that can be chosen by a reader and which cause another document to be retrieved and displayed.
  • pull-down menu Pull-down menus are the type commonly used in menu bars (usually near the top of a window or screen), which are most often used for performing actions, whereas pop-up (or "fly-out") menus are more likely to be used for setting a value, and might appear anywhere in a window.
  • pop-up window A window that automatically loads for viewing without being selected by the user.
  • search engine A (usually web-based) system for searching the information available on the Web.
    Some search engines work by automatically searching the contents of other systems and creating a database of the results. Other search engines contain only material manually approved for inclusion in a database, and some combine the two approaches.
  • domain name Domain name addresses, together with IP addresses, are the two forms of Internet addresses in common use. Domain name addresses all end with a correct top-level domain. The top-level domains may be any of these:

    • com
    • edu
    • gov
    • int
    • mil
    • net
    • org
    • a two-letter country code, such as us, uk, or mx. See the country code table.
    The Internet Corporation for Assigned Names and Numbers (ICANN) announced a new series of top-level domains available for registration, with more to come. They are:
    • aero
    • asia
    • biz
    • cat
    • coop
    • info
    • jobs
    • mobi
    • museum
    • name
    • post
    • pro
    • tel
    • travel
    A complete domain address adds one or more terms to the left of the top-level domain, separated by dots. The top-level domain at the right is the most general; each term to the left is more specific.
  • spam An inappropriate attempt to use a mailing list, or USENET or other networked communications facility as if it was a broadcast medium (which it is not) by sending the same message to a large number of people who didn't ask for it. The term probably comes from a famous Monty Python skit which featured the word spam repeated over and over. The term may also have come from someone's low opinion of the food product with the same name, which is generally perceived as a generic content-free waste of resources. (Spam® is a registered trademark of Hormel Corporation, for its processed meat product.)
  • WWW (World Wide Web) World Wide Web (or simply Web for short) is a term frequently used (incorrectly) when referring to "The Internet". WWW has two major meanings. First, loosely used: the whole constellation of resources that can be accessed using Gopher, FTP, HTTP, telnet, USENET, WAIS and some other tools. Second, the universe of hypertext servers (HTTP servers), more commonly called "web servers", which are the servers that serve web pages to web browsers.
  • HTML HyperText Markup Language. The coding system used to create WWW pages. A page written in HTML is a text file that includes tags in angle brackets that control the fonts and type sizes, insertion of graphics, layout of tables and frames, paragraphing, calls to short runnable programs, and hypertext links to other pages. Files written in HTML generally use an .html or .htm extension.
  • HTTP HyperText Transfer Protocol. It is the main protocol used on the World Wide Web for requesting and delivering web pages, which is what makes linking to other web sites possible. The address of a web page begins with "http://" and is followed by the domain name or IP address.
  • URL Uniform Resource Locator. URLs specify the location of a resource in the Internet. You can type or paste a URL into the Location window in your browser and then connect to it. The URL shows the type of item and its basic address and path. The major types are http, gopher, ftp, telnet, newsgroups, news articles, and files, which may be programs, text, graphics, audio, video, etc.
  • FTP File Transfer Protocol. The Internet protocol that permits you to transfer files between your system and another system. 
  • ISP Internet Service Provider.
  • TCP/IP This is the suite of protocols that defines the Internet. Originally designed for the UNIX operating system, TCP/IP software is now included with every major kind of computer operating system. To be truly on the Internet, your computer must have TCP/IP software.
  • BBS (Bulletin Board System)
    A computerized meeting and announcement system that allows people to carry on discussions, upload and download files, and make announcements without the people being connected to the computer at the same time. In the early 1990s there were many thousands (millions?) of BBSs around the world. Most were very small, running on a single IBM clone PC with 1 or 2 phone lines. Some were very large, and the line between a BBS and a system like AOL gets crossed at some point, but it is not clearly drawn.
  • LAN (Local Area Network) A computer network limited to the immediate area, usually the same building or floor of a building.

  • WAN (Wide Area Network) Any internet or network that covers an area larger than a single building or campus.
  • PDF Adobe's Portable Document Format. It is often used as a format which allows much more complete, controlled layout of a page and its graphics and text than conventional HTML does. It requires a browser plug-in to see a web page in PDF format. Files will usually have a .pdf extension.
    To create a page in PDF format, you need Adobe Acrobat (not the free Acrobat Reader) or other premium Adobe software.
  • GIF Graphics Interchange Format. A bitmap graphical format originally developed for CompuServe that is widely used in WWW pages. It is particularly good for text art, cartoon art, poster art, and line drawings, all types with solid colors and distinct lines or borders between different colors. GIF files use a .gif extension.
  • JPEG (Joint Photographic Experts Group) JPEG is most commonly mentioned as a format for image files. JPEG format is preferred to the GIF format for photographic images as opposed to line art or simple logo art.
  • MIDI Musical Instrument Digital Interface.
  • CGI Common Gateway Interface. A method used by WWW pages to communicate with programs run on the web server.
  • IRC Internet Relay Chat. An Internet protocol that allows people all over the world to meet in conference groups (called channels) and chat with each other by typing.
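
The URL and domain name entries above describe how an address breaks down into a type, a domain name, and a path, with the top-level domain at the right being the most general term. A minimal sketch of that breakdown, using Python's standard `urllib.parse` module, might look like this. The function name `describe_url` and the example addresses are made up for illustration.

```python
from urllib.parse import urlsplit

def describe_url(url):
    """Split a URL into its type (scheme), domain name, and path.

    Assumes the host is a domain name. If the host is an IP address,
    the "top_level_domain" value is not meaningful.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".") if host else []
    return {
        "scheme": parts.scheme,      # the type of item, e.g. http or ftp
        "host": host,                # the complete domain address
        "path": parts.path,          # where the resource sits on the server
        "top_level_domain": labels[-1] if labels else "",
        # Terms ordered from the most general (right) to the most specific (left)
        "labels_general_to_specific": list(reversed(labels)),
    }

# Hypothetical example address
info = describe_url("http://www.example.org/docs/index.html")
for key, value in info.items():
    print(key, "=", value)
```

Reading the domain labels from right to left mirrors the rule in the domain name entry: each term further to the left is more specific than the one before it.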