
CU-MCA-SEM-VI-Web optimization

Published by Teamlease Edtech Ltd (Amita Chitroda), 2021-11-03 14:10:42


In Google and other search engines, the results page often features paid ads at the top of the page, followed by the regular results, or what search marketers call the "organic search results". Traffic that comes via SEO is often referred to as "organic search traffic" to differentiate it from traffic that comes through paid search. Paid search is often referred to as search engine marketing (SEM) or pay-per-click (PPC).

5.2 BLACK HAT SEO

Black hat SEO refers to a set of practices that are used to increase a site's or page's rank in search engines through means that violate the search engines' terms of service. The term "black hat" originated in Western movies to distinguish the "bad guys" from the "good guys," who wore white hats (see white hat SEO). More recently, it has been used to describe computer hackers, virus creators, and those who perform unethical actions with computers. For example: content automation, doorway pages, hidden text or links, keyword stuffing, reporting a competitor (negative SEO), etc.

5.2.1 The Risks of Black Hat SEO:

There are significant risks involved with using black hat tactics to rank your website, and that's the reason why most SEOs choose not to consider such approaches. The majority of the SEO industry deems these practices to be completely unethical. But the reality is that there are, and likely always will be, a small percentage of marketers who want to try to cheat the system and fast-track their site's organic success. However, even if black hat SEO techniques prove to work for your website, the results are often short-lived.

Reasons To Avoid Black Hat SEO:

1. It Can Negatively Impact Your Search Rankings and Visibility: The number one reason not to use black hat SEO tactics is that they will ultimately result in your site losing search rankings, visibility, and traffic. Just take a look at figure 5.1 below. This is the visibility of a site that participated in unnatural tactics and was negatively impacted as a result:

Figure 5.1

When a website loses traffic and visibility, this typically means that conversions and revenue follow a similar trend. This, in itself, can mean a reduction in a business' income and lead to job losses or even business closures. At best, a severe decline in organic traffic will have to be supplemented with a higher investment in PPC or other paid media.

2. It Won't Drive Long-Term Results: Even in cases when rankings and organic performance initially increase from manipulative techniques, these are rarely sustained. While it might take some time for Google to determine that a site participates in unethical approaches (perhaps because a manual review was conducted or a core algorithm was updated), once it happens, a loss of traffic is inevitable. Perhaps the only thing worse than struggling to rank a site at all is seeing rankings and traffic artificially inflated, only to drop suddenly in the near future. Businesses need predictability, and that's not something that black hat tactics can deliver.

3. It Typically Results In A Poor User Experience: SEO needs to consider a user's experience on a site and work to serve the best content and the best UX. However, black hat tactics do the exact opposite; they optimize for search engines (at least for what they think search engines want to see) rather than users. This, in itself, can be problematic. Trust plays a large part in search success. If primary consideration is given to search engines over users, there's a good chance that the site's ability to convert will be significantly limited.

5.2.2 Black Hat SEO Tactics to Avoid:

1. Keyword Stuffing: Repeating your page's main target keyword(s) excessively won't help you to rank. Keyword stuffing, as it's known, will almost certainly result in the opposite. Black hat SEOs will sometimes attempt to manipulate a site's rankings by including a keyword unnaturally across a page. Keyword stuffing often occurs in random blocks that sit outside of the main content or within paragraphs that just don't make sense when you read them aloud.
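To make this concrete, here is a contrived, hypothetical sketch of what stuffed versus natural copy looks like in page markup (the product and keyword are invented purely for illustration):

<!-- AVOID: keyword-stuffed copy written for search engines -->
<p>Cheap running shoes. Buy cheap running shoes. Our cheap running shoes
are the best cheap running shoes for cheap running shoes online.</p>

<!-- BETTER: natural copy that still mentions the target phrase once -->
<p>Looking for affordable running shoes? Compare our budget-friendly range
and find a pair that suits your training style.</p>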

2. Automatically Generated/Duplicate Content: Creating great content isn't easy, yet there's no hiding from the fact that it remains one of Google's top three ranking factors. A common black hat technique is to automatically generate content to rank for a large number of keywords without actually creating useful, unique content. An example would be a multitude of location pages where the same content is used for each one, with only the place name changing. Be sure to take the time to create SEO-friendly content to avoid issues caused by low-quality or duplicate pages.

3. Hidden Text: Hidden text is text that is the same color as the background, is positioned off the screen or behind an image, purposefully uses CSS to hide it from users, or even uses a font size of zero. This is deceptive and is sometimes used to stuff keywords in; many marketers would provide long lists of keywords that they wanted their content to rank for in SERPs. What we're talking about here is a clear attempt to hide text completely; this doesn't apply to text that's in an accordion, in tabs, or loaded dynamically using JavaScript. We definitely don't recommend adding hidden text to your pages. Search engine crawlers are far more sophisticated now and understand that you're trying to cram in keywords.
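For illustration only, here is a hypothetical sketch of the hidden-text patterns described above; these are exactly the constructs search engines penalize, so treat this as a diagnostic reference rather than something to deploy:

<!-- Text in the same color as the background -->
<p style="color:#ffffff; background-color:#ffffff;">cheap shoes best shoes buy shoes</p>

<!-- Text positioned off the screen with CSS -->
<div style="position:absolute; left:-9999px;">hidden keyword list</div>

<!-- Text with a font size of zero -->
<span style="font-size:0;">more stuffed keywords</span>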

4. Doorway/Gateway Pages: Creating pages that target specific search queries with content intended to act only as a funnel to one page is considered a violation of Google's guidelines. These types of pages are known as doorway or gateway pages. Every piece of content on your site should have a specific purpose, and you shouldn't be creating pages in an attempt to rank for keywords that aren't entirely relevant. Examples of this include:

• Creating pages to target geographically targeted keywords in locations where your business doesn't have a physical presence, funneling users to a single page
• Pages created solely to rank for search queries rather than to meet a user's need

Create content for humans, not search engines.

5. Cloaking: Cloaking is a tactic that involves serving different content or URLs to users and search engines, essentially providing a different experience to each. This is a clear attempt to rank a page based on content created for search engines while pointing users to somewhere (or something) different. This is a deceptive practice, making it a violation of search engine guidelines. Focus your efforts on designing the best possible experiences for your users, and there's a good chance the search engines will also love your page.

6. Paid/Manipulative Links: Link schemes are one of the most common types of black hat SEO, and this is the area where a lot of confusion often originates. It's common sense to many marketers that you should be writing content that works for your users and that you shouldn't be hiding text, but link building gets a little more complex. The bottom line is that links should be earned, especially considering that they're editorial votes of trust from one website to another. This means that you should avoid tactics such as:

• Paid (sponsored) links that don't contain a rel="nofollow" or rel="sponsored" attribute
• Excessive link exchanges
• Blog comment spam
• Forum spam
• Large-scale article marketing or guest posting campaigns
• Automated link building
• Spammy directories, bookmarking sites, and web 2.0 properties
• Site-wide footer or sidebar links
• Links that use exact match or commercial anchor text
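The first point above is worth showing in markup. A minimal sketch of how a paid placement can be disclosed so that it does not count as a link scheme (the URLs are placeholders):

<!-- A paid placement, correctly flagged for search engines -->
<a href="https://www.example.com/partner-offer" rel="sponsored">Partner offer</a>

<!-- An untrusted or compensated link can also be neutralized with nofollow -->
<a href="https://www.example.com/some-page" rel="nofollow">Some page</a>

With either attribute in place, the link no longer passes authority, so it cannot be treated as a purchased vote of trust.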

7. Misused Structured Data and Rich Snippets: While structured data can help define entities, actions, and relationships online, a common black hat tactic is to abuse or misuse this type of markup. This usually means using structured data to give factually incorrect information; for example, some marketers write up fake reviews that give 5-star ratings to create more favorable structured data for their site, boost their business' SERP position, and enjoy a higher CTR. Like the other tactics on this list, this is pure deception and not a tactic you should consider.

8. Misleading Redirects: Whether it's an older page that you're updating to a new URL or a site migration you're preparing for, using redirects is a common part of SEO. There's nothing wrong with this; it's the preferred method of ensuring your site is well organized and easily accessible by users and search engine crawlers. However, similar to cloaking, sneaky redirects are placed by black hat SEOs to deceive search engines and display content that's different from what a user sees. Oftentimes a search engine will index the original page, while users are taken to a different destination URL. Google's Webmaster Guidelines specifically list sneaky redirects as a black hat tactic that violates its guidelines.

9. Negative SEO: It would be wrong to assume that all black hat SEO tactics target the site that a marketer is trying to rank. Some unethical SEOs use negative SEO in an attempt to reduce their competitors' rankings. Think of this as using tactics that violate Google's guidelines on someone else's site rather than your own. In practice, this commonly means pointing large numbers of unnatural links at someone else's domain in the hope that they're penalized because of it. While not very common, especially given that Google is getting better at ignoring the links that originate from such attacks, it's important to be aware of this and regularly analyze your link profile (something that can be done using the SEMrush Backlink Audit Tool).

5.2.3 How to Report Black Hat SEO:

You may have asked yourself the following question while reading through this guide: what if you spot one of your competitors using black hat tactics and it isn't being penalized? You can file a spam report with Google when you believe that a website is ranking due to paid links, spam, or other violations. While reporting a site won't result in direct action being taken, you're improving algorithmic spam detection. As a marketer, it's often disheartening to discover that a website is cheating the system and getting away with it. While Google is becoming increasingly good at preventing such sites from ranking in top positions on the SERPs, there are still sites performing well by leveraging black hat tactics. Depending on the severity of the web spam that other sites are using, there's a good chance that they'll be negatively impacted in the not-too-distant future after another algorithm update.

5.3 WHITE HAT SEO

White hat SEO refers to optimization tactics that work within the search engines' terms of service to improve a site's search engine results page (SERP) rankings while maintaining the integrity of your website. For example: offering quality content and services, fast site loading times and mobile-friendliness, using descriptive, keyword-rich meta tags, and making your site easy to navigate.

5.3.1 Why Is White Hat SEO Important?

White hat SEO is important because, without it, search engine results would be chaotic. Without rules to guide the internet, website owners would rely on more dated SEO methods to rank on the search pages, and users would have to comb through a lot of irrelevant sites to find what they're looking for. White hat SEO is important because it benefits everyone. Google makes sure its algorithms only rank great content that captures user intent for every keyword search, because doing so encourages millions of internet users to keep using the search engine. Site owners also benefit because they can boost their ranking without resorting to dishonest tactics. Lastly, users benefit because they can find what they're looking for easily through organic search.

5.3.2 What Are Some White Hat SEO Strategies?

1. Offer Valuable Content: Blog posts are used by 86% of marketing teams as a lead generator. Content is still king. So, don't produce content for search engines; create content for people looking for information. You'll also want your content to be relevant and authoritative so that others will use it for reference. To get a quick view of your website's overall search engine presence regarding organic traffic and the number of backlinks, use our Domain Overview tool.

Figure 5.2

2. Satisfy User Intent: Satisfying user intent is Google's No. 1 goal, so you should also aim for it as a website owner. User intent is the person's goal when they type a query into a search engine.

For example, if they type "keto pasta recipes," they'd expect the results to show them keto diet-approved pasta recipes. The common types of user intent are informational, commercial, navigational, and transactional. If a person visits a clothing website, it's safe to say they're looking to buy, and that would be categorized as commercial and transactional. If someone looks up "how to grow mushrooms," they'd be looking for information. A few strategies to use to satisfy user intent include:

• Knowing your audience and what they're looking for
• Optimizing your content to fit specific keywords
• Formatting your content for skimmers: use subheadings and leave a lot of white space
• Using videos and images

3. Make Your Website Mobile-Friendly: In 2019, 52.2% of worldwide traffic used mobile phones to access the internet. This means that if your website is not optimized to be viewed on a smartphone, you're losing potential leads. Your website visitors won't stay if they can't navigate your site. Type your URL into the search box of Google's mobile-friendly test tool. You'll get an answer on whether your website can be viewed on mobile phones, along with a screenshot of how your pages look on a phone screen and a list of recommendations for optimization.
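One common prerequisite for passing a mobile-friendly check, shown here as a minimal sketch, is a viewport meta tag in the document head so the browser scales the layout to the device width (the rest of a responsive design, such as CSS media queries, is assumed to exist elsewhere):

<head>
   <meta name = "viewport" content = "width=device-width, initial-scale=1" />
</head>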

5.4 DIFFERENCE BETWEEN BLACK HAT SEO AND WHITE HAT SEO

1. Black Hat SEO refers to the use of aggressive SEO tactics and strategies that focus only on search engines, not on the human audience. White Hat SEO refers to the use of optimization tactics and strategies that focus on the human audience and completely follow search engine rules and policies.
2. Black Hat SEO is used by those who are looking for a quick financial return on their website. White Hat SEO improves your search performance on the search engine results page (SERP) while maintaining the integrity of the website.
3. Black Hat SEO stuffs and spams keywords into the on-page content to fool the search engine spiders and improve ranking. White Hat SEO uses properly researched, crafted titles and meta tags suited to the webpage, industry, and relevance.
4. Black Hat SEO consists of irrelevant backlinks. White Hat SEO gets links because of quality content.
5. Black Hat SEO exchanges links for ranking. White Hat SEO consists of natural links.
6. Black Hat SEO is also known as Unethical SEO. White Hat SEO is also known as Ethical SEO.
7. Black Hat SEO is used for short-term goals and benefits. White Hat SEO is used for long-term goals and benefits.

Table 5.1

5.5 SUMMARY

• There are significant risks involved with using black hat tactics to rank your website, and that's the reason why most SEOs choose not to consider such approaches. The majority of the SEO industry deems these practices to be completely unethical.
• A common black hat technique is to automatically generate content to rank for a large number of keywords without actually creating useful, unique content. An example would be a multitude of location pages where the same content is used for each one, with only the place name changing. Creating pages that target specific search queries with content intended to act only as a funnel to one page is considered a violation of Google's guidelines. These types of pages are known as doorway or gateway pages.
• Sneaky redirects are placed by black hat SEOs to deceive search engines and display content that's different from what a user sees. Oftentimes a search engine will index the original page, while users are taken to a different destination URL.
• You can file a spam report with Google when you believe that a website is ranking due to paid links, spam, or other violations. While reporting a site won't result in direct action being taken, you're improving algorithmic spam detection.

5.6 KEYWORDS

• Report: an account given of a particular matter, especially in the form of an official document, after thorough investigation or consideration by an appointed person or body.
• Structured Data: data that resides in a fixed field within a file or record.
• Blog: a regularly updated website or web page, typically one run by an individual or small group, that is written in an informal or conversational style.

• Manipulative: serving or intended to control or influence others in an artful and often unfair or selfish way.
• SERP: the page you see after entering a query into Google.

5.7 LEARNING ACTIVITY

1. Define Black Hat SEO
___________________________________________________________________________
___________________________________________________________________________

2. Define White Hat SEO
___________________________________________________________________________
___________________________________________________________________________

5.8 UNIT END QUESTIONS

A. Descriptive Questions

Short Questions:
1. State reasons to avoid Black Hat SEO.
2. Explain doorway/gateway pages.
3. Explain paid/manipulative links.
4. Why is White Hat SEO important?

Long Questions:
1. List and explain Black Hat SEO tactics to avoid.
2. Differentiate between Black Hat SEO and White Hat SEO.
3. Explain White Hat SEO strategies.

B. Multiple Choice Questions

1. Blog posts are used by 86% of marketing teams as a _______
a. lead generator
b. marketing
c. information
d. survey

2. In 2019, _____ of worldwide traffic used mobile phones to access the internet.
a. 55%
b. 52.2%
c. 60%
d. 90%

3. Black Hat SEO is used by those who are looking for a quick ________ on their website.
a. Customer
b. Feedback
c. Financial Return
d. None of these

4. White Hat SEO is _______ SEO.
a. Main
b. Primary
c. Unethical
d. Ethical

5. White Hat SEO improves your search performance on the search engine results page (SERP) while maintaining the _____ of the website.

a. Integrity
b. Authenticity
c. Confidentiality
d. Privacy

Answers: 1-a, 2-b, 3-c, 4-d, 5-a

5.9 REFERENCES

Reference books:
• Jerri Ledford, "Search Engine Optimization", Wiley Publishing Inc.
• S.S. Niranga, "Mobile Web Performance Optimization", PACKT Publishing

Textbook references:
• Danny Dover, "Search Engine Optimization Secrets", Wiley Publishing Inc.
• Bruce Clay, "Search Engine Optimization All-In-One for Dummies, A Wiley Brand", John Wiley & Sons.
• Aaron Matthew Wall, "Search Engine Optimization", http://www.seobook.com/seo-tools.pdf
• H.J. Bernardin, Human Resource Management, Tata McGraw Hill, New Delhi, 2004.

Websites:
• https://www.umassmed.edu/globalassets/it/web-services/google-analytics/google-analytics-user-guide.pdf
• https://static.googleusercontent.com/media/www.google.com/en//grants/education/Google_Analytics_Training.pdf
• https://analytics.google.com/analytics/web/#/p287091400/reports/reportinghub
• Internet Basics: What is the Internet? (gcfglobal.org)
• https://www.semrush.com/blog/black-hat-seo/

UNIT - 6: HTML

STRUCTURE

6.0 Learning Objectives
6.1 Introduction
6.2 Basic HTML Knowledge
6.3 The Analysis, Meta Tags
6.4 Creating Sitemaps
    6.4.1 Sitemap Format
    6.4.2 Sitemap Extensions for Additional Media Types
    6.4.3 General Sitemap Guidelines
    6.4.4 Create a Sitemap
6.5 Summary
6.6 Keywords
6.7 Learning Activity
6.8 Unit End Questions
6.9 References

6.0 LEARNING OBJECTIVES

After studying this unit, you will be able to:
• Explain basic HTML tags.
• Describe meta tags and how they are used.
• Create different HTML pages using meta tags.
• Create a sitemap by following sitemap guidelines.

6.1 INTRODUCTION

HyperText Markup Language, or HTML, is the standard markup language for documents designed to be displayed in a web browser. It can be assisted by technologies such as Cascading Style Sheets (CSS) and scripting languages such as JavaScript. Web browsers receive HTML documents from a web server or from local storage and render the documents into multimedia web pages. HTML describes the structure of a web page semantically and originally included cues for the appearance of the document.

HTML elements are the building blocks of HTML pages. With HTML constructs, images and other objects such as interactive forms may be embedded into the rendered page.

HTML provides a means to create structured documents by denoting structural semantics for text such as headings, paragraphs, lists, links, quotes and other items. HTML elements are delineated by tags, written using angle brackets. Tags such as <img /> and <input /> directly introduce content into the page. Other tags such as <p> surround and provide information about document text and may include other tags as sub-elements. Browsers do not display the HTML tags, but use them to interpret the content of the page.

6.2 BASIC HTML KNOWLEDGE

HTML stands for HyperText Markup Language, which is the most widely used language on the Web to develop web pages. HTML was created by Tim Berners-Lee in late 1991, but "HTML 2.0" was the first standard HTML specification, published in 1995. HTML 4.01 was a major version of HTML, published in late 1999. Though HTML 4.01 is still widely used, we currently have HTML5, an extension of HTML 4.01, which was published in 2012.

HTML is a must for students and working professionals who want to become great software engineers, especially when working in the web development domain. Some of the key advantages of learning HTML:

• Create a website: You can create a website or customize an existing web template if you know HTML well.
• Become a web designer: If you want to start a career as a professional web designer, HTML and CSS design is a must-have skill.
• Understand the web: If you want to optimize your website to boost its speed and performance, it is good to know HTML to yield the best results.
• Learn other languages: Once you understand the basics of HTML, other related technologies like JavaScript, PHP, or Angular become easier to understand.

HTML Tags

As noted earlier, HTML is a markup language and makes use of various tags to format the content. These tags are enclosed within angle brackets, as in <Tag Name>. Except for a few tags, most tags have corresponding closing tags. For example, <html> has the closing tag </html>, and the <body> tag has the closing tag </body>.
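Here is a minimal example document (a representative sketch; the exact sample page this text originally referenced is not preserved) built from the tags that Table 6.1 below describes:

<!DOCTYPE html>
<html>
   <head>
      <title>This is document title</title>
   </head>
   <body>
      <h1>This is a heading</h1>
      <p>Document content goes here.</p>
   </body>
</html>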

The example above uses the following tags:

1. <!DOCTYPE...>: This tag defines the document type and HTML version.
2. <html>: This tag encloses the complete HTML document and mainly comprises the document header, represented by <head>...</head>, and the document body, represented by <body>...</body>.
3. <head>: This tag represents the document's header, which can keep other HTML tags like <title>, <link>, etc.
4. <title>: The <title> tag is used inside the <head> tag to mention the document title.
5. <body>: This tag represents the document's body, which keeps other HTML tags like <h1>, <div>, <p>, etc.
6. <h1>: This tag represents a heading.
7. <p>: This tag represents a paragraph.

Table 6.1

To learn HTML, you will need to study various tags and understand how they behave while formatting a textual document. Learning HTML is simple, as users just have to learn the usage of different tags in order to format text or images and make a beautiful webpage. The World Wide Web Consortium (W3C) recommends using lowercase tags from HTML 4 onwards.

HTML Document Structure:

<html>
   <head>
      Document header related tags
   </head>
   <body>
      Document body related tags
   </body>
</html>

6.3 THE ANALYSIS, META TAGS

HTML lets you specify metadata, additional important information about a document, in a variety of ways. The META elements can be used to include name/value pairs describing properties of the HTML document, such as the author, expiry date, a list of keywords, etc.

The <meta> tag is used to provide such additional information. This tag is an empty element and so does not have a closing tag, but it carries information within its attributes. You can include one or more meta tags in your document based on what information you want to keep, but in general, meta tags do not impact the physical appearance of the document, so from an appearance point of view it does not matter whether you include them or not.

Adding Meta Tags to Your Documents:

You can add metadata to your web pages by placing <meta> tags inside the header of the document, which is represented by the <head> and </head> tags. A meta tag can have the following attributes in addition to the core attributes:

1. name: Name for the property. Can be anything; examples include keywords, description, author, revised, generator, etc.
2. content: Specifies the property's value.
3. scheme: Specifies a scheme to interpret the property's value (as declared in the content attribute).
4. http-equiv: Used for HTTP response message headers. For example, http-equiv can be used to refresh the page or to set a cookie. Values include content-type, expires, refresh and set-cookie.

Table 6.2

Specifying Keywords:

You can use the <meta> tag to specify important keywords related to the document; these keywords are later used by search engines while indexing your webpage for searching purposes.

Example

Following is an example where we add HTML, Meta Tags, and Metadata as important keywords about the document:

<!DOCTYPE html>
<html>
   <head>
      <title>Meta Tags Example</title>
      <meta name = "keywords" content = "HTML, Meta Tags, Metadata" />
   </head>
   <body>
      <p>Hello HTML5!</p>
   </body>
</html>

Document Description:

You can use the <meta> tag to give a short description of the document. This again can be used by search engines while indexing your webpage for searching purposes.

Example:

<!DOCTYPE html>
<html>
   <head>
      <title>Meta Tags Example</title>
      <meta name = "keywords" content = "HTML, Meta Tags, Metadata" />
      <meta name = "description" content = "Learning about Meta Tags." />
   </head>
   <body>
      <p>Hello HTML5!</p>
   </body>
</html>

Document Revision Date:

You can use the <meta> tag to give information about when the document was last updated. This information can be used by web browsers while refreshing your webpage.

Example:

<!DOCTYPE html>
<html>
   <head>
      <title>Meta Tags Example</title>
      <meta name = "keywords" content = "HTML, Meta Tags, Metadata" />
      <meta name = "description" content = "Learning about Meta Tags." />
      <meta name = "revised" content = "Tutorialspoint, 3/7/2014" />
   </head>
   <body>
      <p>Hello HTML5!</p>
   </body>
</html>

Document Refreshing:

A <meta> tag can be used to specify a duration after which your web page will keep refreshing automatically.

Example

If you want your page to keep refreshing every 5 seconds, use the following syntax:

<!DOCTYPE html>
<html>
   <head>
      <title>Meta Tags Example</title>
      <meta name = "keywords" content = "HTML, Meta Tags, Metadata" />
      <meta name = "description" content = "Learning about Meta Tags." />
      <meta name = "revised" content = "Tutorialspoint, 3/7/2014" />
      <meta http-equiv = "refresh" content = "5" />
   </head>
   <body>
      <p>Hello HTML5!</p>
   </body>
</html>

Page Redirection:

You can use the <meta> tag to redirect your page to another webpage. You can also specify a duration if you want to redirect the page after a certain number of seconds.

Example

Following is an example of redirecting the current page to another page after 5 seconds. If you want to redirect the page immediately, do not specify the duration in the content attribute:

<!DOCTYPE html>
<html>
   <head>
      <title>Meta Tags Example</title>
      <meta name = "keywords" content = "HTML, Meta Tags, Metadata" />
      <meta name = "description" content = "Learning about Meta Tags." />
      <meta name = "revised" content = "Tutorialspoint, 3/7/2014" />
      <meta http-equiv = "refresh" content = "5; url = http://www.tutorialspoint.com" />
   </head>
   <body>
      <p>Hello HTML5!</p>
   </body>
</html>

6.4 CREATING SITEMAPS

This section describes how to build a sitemap and make it available to Google.

1. Decide which sitemap format you want to use.
2. Create the sitemap, either automatically or manually.
3. Make your sitemap available to Google by adding it to your robots.txt file or directly submitting it to Search Console.
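For step 3, the robots.txt route is a single directive. A minimal sketch, assuming the sitemap lives at the site root (the URL is a placeholder):

# robots.txt
Sitemap: https://www.example.com/sitemap.xml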

6.4.1 Sitemap Formats:

Google supports several sitemap formats:

• XML
• RSS, mRSS, and Atom 1.0
• Text

Google expects the standard sitemap protocol in all formats. Google does not currently consume the <priority> attribute in sitemaps.

All formats limit a single sitemap to 50MB (uncompressed) and 50,000 URLs. If you have a larger file or more URLs, you will have to break your list into multiple sitemaps. You can optionally create a sitemap index file (a file that points to a list of sitemaps) and submit that single index file to Google. You can submit multiple sitemaps and/or sitemap index files to Google.

1. XML: Here is a very basic XML sitemap that includes the location of a single URL:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>http://www.example.com/foo.html</loc>
      <lastmod>2018-06-04</lastmod>
   </url>
</urlset>

You can find more complex examples and full documentation at sitemaps.org. You can see examples of sitemaps that specify alternate language pages and sitemaps for news, image, or video files.

2. RSS, mRSS, and Atom 1.0: If you have a blog with an RSS or Atom feed, you can submit the feed's URL as a sitemap. Most blog software is able to create a feed for you, but recognize that this feed only provides information on recent URLs.

• Google accepts RSS 2.0 and Atom 1.0 feeds.
• You can use an mRSS (media RSS) feed to provide Google details about video content on your site.

3. Text: If your sitemap includes only web page URLs, you can provide Google with a simple text file that contains one URL per line. For example:

http://www.example.com/file1.html
http://www.example.com/file2.html

Guidelines for text file sitemaps:

• Encode your file using UTF-8 encoding.
• Don't put anything other than URLs in the sitemap file.
• You can name the text file anything you wish, provided it has a .txt extension (for instance, sitemap.txt).
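The sitemap index file mentioned above follows the same protocol as a sitemap. A minimal sketch, with placeholder file names, pointing to two child sitemaps:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <sitemap>
      <loc>http://www.example.com/sitemap1.xml</loc>
      <lastmod>2018-06-04</lastmod>
   </sitemap>
   <sitemap>
      <loc>http://www.example.com/sitemap2.xml</loc>
   </sitemap>
</sitemapindex>

You would submit only the index file; Google then fetches each child sitemap it lists.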

6.4.2 Sitemap Extensions for Additional Media Types:

Google supports extended sitemap syntax for the following media types. Use these extensions to describe video files, images, and other hard-to-parse content on your site to improve indexing.

• Video
• Images
• Google News

6.4.3 General Sitemap Guidelines:

• Use consistent, fully-qualified URLs. Google will crawl your URLs exactly as listed. For instance, if your site is at https://www.example.com/, don't specify a URL as https://example.com/ (missing www) or ./mypage.html (a relative URL).
• A sitemap can be posted anywhere on your site, but a sitemap affects only descendants of the parent directory. Therefore, a sitemap posted at the site root can affect all files on the site, which is where we recommend posting your sitemaps.
• Don't include session IDs from URLs in your sitemap. This reduces duplicate crawling of those URLs.
• Tell Google about alternate language versions of a URL using hreflang annotations.
• Sitemap files must be UTF-8 encoded, and URLs escaped appropriately.
• Break up large sitemaps into smaller sitemaps: a sitemap can contain up to 50,000 URLs and must not exceed 50MB uncompressed. Use a sitemap index file to list all the individual sitemaps and submit this single file to Google rather than submitting individual sitemaps.
• List only canonical URLs in your sitemaps. If you have two versions of a page, list in the sitemap only the one you prefer to appear in search results. If you have two versions of your site (for example, www and non-www), decide which is your preferred site, put the sitemap there, and add rel=canonical or redirects on the other site.
• If you have different URLs for the mobile and desktop versions of a page, we recommend pointing to only one version in a sitemap. However, if you want to point to both URLs, annotate your URLs to indicate the desktop and mobile versions.
• Use sitemap extensions for pointing to additional media types such as video, images, and news.
• If you have alternate pages for different languages or regions, you can use hreflang in either a sitemap or HTML tags to indicate the alternate URLs.
• Non-alphanumeric and non-Latin characters: your sitemap file must be UTF-8 encoded (you can generally do this when you save the file). As with all XML files, any data values (including URLs) must use entity escape codes for the characters listed in the following table. A sitemap can contain only ASCII characters; it can't contain extended ASCII characters or certain control codes or special characters such as * and {}. If your sitemap URL contains these characters, you'll receive an error when you try to add it.

Character (symbol): escape code
Ampersand (&): &amp;
Single Quote ('): &apos;
Double Quote ("): &quot;
Greater Than (>): &gt;
Less Than (<): &lt;

Table 6.3

• In addition, all URLs (including the URL of your sitemap) must be encoded for readability by the web server on which they are located, and URL-escaped. However, if you are using any sort of script, tool, or log file to generate your URLs (anything except typing them in by hand), this is usually already done for you. If you submit your sitemap and receive an error that Google is unable to find some of your URLs, check to make sure that your URLs follow the RFC-3986 standard for URIs, the RFC-3987 standard for IRIs, and the XML standard.

Here is an example of a URL that uses a non-ASCII character (ü), as well as a character that requires entity escaping (&):

http://www.example.com/ümlat.html&q=name

Here is that same URL, ISO-8859-1 encoded (for hosting on a server that uses that encoding) and URL-escaped:

http://www.example.com/%FCmlat.html&q=name

Here is that same URL, UTF-8 encoded (for hosting on a server that uses that encoding) and URL-escaped:

http://www.example.com/%C3%BCmlat.html&q=name

Here is that same URL, entity escaped:

http://www.example.com/%C3%BCmlat.html&amp;q=name

• Remember that sitemaps are a recommendation to Google about which pages you think are important; Google does not pledge to crawl every URL in a sitemap.
• Google ignores <priority> and <changefreq> values.
• Google uses the <lastmod> value if it's consistently and verifiably (for example, by comparing to the last modification of the page) accurate.
• The position of a URL in a sitemap is not important; Google does not crawl URLs in the order in which they appear in your sitemap.
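One guideline above mentions hreflang annotations in a sitemap. A minimal sketch of an alternate-language annotation, with hypothetical English and German page URLs (each language version should list itself and all of its alternates):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
   <url>
      <loc>http://www.example.com/english/page.html</loc>
      <xhtml:link rel="alternate" hreflang="de" href="http://www.example.com/deutsch/page.html"/>
      <xhtml:link rel="alternate" hreflang="en" href="http://www.example.com/english/page.html"/>
   </url>
</urlset>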

6.4.4 Create a Sitemap

When creating a sitemap, you're telling search engines which URLs you prefer to show in search results. These are the canonical URLs. If you have the same content accessible under different URLs, choose the URL you prefer and include that in the sitemap instead of all the URLs that lead to the same content.

Once you've decided which URLs to include in the sitemap, pick one of the following ways to create a sitemap, depending on your site architecture and size:

• Let your CMS generate a sitemap for you.
• For sitemaps with less than a few dozen URLs, you can manually create a sitemap.
• For sitemaps with more than a few dozen URLs, automatically generate a sitemap.

Let your CMS generate a sitemap for you

If you're using a CMS such as WordPress, Wix, or Blogger, it's likely that your CMS has already made a sitemap available to search engines. Try searching for information about how your CMS generates sitemaps, or how to create a sitemap if your CMS doesn't generate one automatically. For example, in the case of Wix, search for "wix sitemap". For all other site setups, you will need to generate the sitemap yourself.

Manually create a sitemap

For sitemaps with less than a few dozen URLs, you may be able to manually create a sitemap. For this, open a text editor such as Windows Notepad or Nano (Linux, macOS), and follow the syntax described in the Sitemap Formats section. You can manually create larger sitemaps, but it's a tedious process.

Automatically generate a sitemap

For sitemaps with more than a few dozen URLs, you will need to generate the sitemap. There are various tools that can generate a sitemap, but the best way is to have your website software generate it for you. For example, you can extract your site's URLs from your website's database and then export the URLs to either the screen or an actual file on your web server. Talk to your developers or server manager about this solution. If you need inspiration for the code, check out our old collection of third-party sitemap generators. Keep in mind that sitemaps can't be larger than 50 MB. Learn more about managing large sitemaps.

6.5 SUMMARY

• HTML stands for HyperText Markup Language, which is the most widely used language on the Web to develop web pages. HTML was created by Tim Berners-Lee in late 1991, but "HTML 2.0" was the first standard HTML specification, published in 1995.

• HTML is a markup language and makes use of various tags to format content. These tags are enclosed within angle brackets. Except for a few tags, most tags have corresponding closing tags.
• To learn HTML, you will need to study various tags and understand how they behave while formatting a textual document. Learning HTML is simple, as users just have to learn the usage of different tags in order to format text or images and make a beautiful webpage.
• HTML lets you specify metadata, additional important information about a document, in a variety of ways. The META elements can be used to include name/value pairs describing properties of the HTML document, such as the author, expiry date, a list of keywords, etc.
• The <meta> tag is used to provide such additional information. This tag is an empty element and so does not have a closing tag, but it carries information within its attributes.
• You can add metadata to your web pages by placing <meta> tags inside the header of the document, which is represented by the <head> and </head> tags.

6.6 KEYWORDS

• Meta: (of a creative work) referring to itself or to the conventions of its genre; self-referential.
• Scheme: specifies a scheme to interpret a property's value (as declared in the content attribute).
• Page Redirection: a technique for moving visitors to a different web page than the one they request.
• Sitemap: a list of the pages of a website within a domain. Sitemaps are used during the planning of a website by its designers; human-visible listings, typically hierarchical, show the pages on a site.
• XML: stands for eXtensible Markup Language. A markup language is a set of codes, or tags, that describes the text in a digital document.

6.7 LEARNING ACTIVITY

1. List advantages of learning HTML
___________________________________________________________________________
___________________________________________________________________________

2. List and explain any 5 HTML tags
___________________________________________________________________________
___________________________________________________________________________

6.8 UNIT END QUESTIONS

A. Descriptive Questions

Short Questions:
1. Explain the document description procedure using an HTML tag.
2. Explain with an example how document refreshing is done using HTML.
3. Explain with an example how page redirection is done using HTML.
4. Explain XML.

Long Questions:
1. Explain general sitemap guidelines.
2. Explain how to create a sitemap.
3. Write an HTML program to create a menu for a coffee station.
4. Explain the Text and RSS, mRSS, and Atom 1.0 sitemap formats.

B. Multiple Choice Questions

1. Google accepts RSS ____ feeds.
a. 2.0
b. 2
c. 2.1
d. 1.0

2. Encode your file using ______ encoding.

a. UTF-7
b. UTF-8
c. UTF-6
d. UTF-5

3. Make your sitemap available to Google by adding it to your _________
a. Meta.txt
b. Data.txt
c. robots.txt
d. console.txt

4. XML includes the location of a ______ URL.
a. Multiple
b. Main
c. Two
d. Single

5. A sitemap can contain up to ________ URLs.
a. 50,000
b. 5,000
c. 2
d. 1

Answers: 1-a, 2-b, 3-c, 4-d, 5-a

6.9 REFERENCES

Reference books:
• Jerri Ledford, "Search Engine Optimization", Wiley Publishing Inc.
• S.S. Niranga, "Mobile Web Performance Optimization", PACKT Publishing

Textbook references:
• Danny Dover, "Search Engine Optimization Secrets", Wiley Publishing Inc.
• Bruce Clay, "Search Engine Optimization All-In-One for Dummies, A Wiley Brand", John Wiley & Sons.
• Aaron Matthew Wall, "Search Engine Optimization", http://www.seobook.com/seo-tools.pdf
• H.J. Bernardin, Human Resource Management, Tata McGraw Hill, New Delhi, 2004.

Websites:
• https://www.umassmed.edu/globalassets/it/web-services/google-analytics/google-analytics-user-guide.pdf
• https://static.googleusercontent.com/media/www.google.com/en//grants/education/Google_Analytics_Training.pdf
• https://analytics.google.com/analytics/web/#/p287091400/reports/reportinghub
• Internet Basics: What is the Internet? (gcfglobal.org)
• https://www.semrush.com/blog/black-hat-seo/

UNIT - 7: OPTIMIZATION TOOL

STRUCTURE

7.0 Learning Objectives
7.1 Introduction
7.2 Optimize SEO Content
7.3 Keyword Research
7.4 Page Speed Optimization Tool
7.5 Anchor Links Optimization
7.6 Internal Link Strategy
7.7 Summary
7.8 Keywords
7.9 Learning Activity
7.10 Unit End Questions
7.11 References

7.0 LEARNING OBJECTIVES

After studying this unit, you will be able to:
• Explain the method of SEO optimization.
• Explain how keyword research is done more effectively.
• Describe how page speed optimization can be set up effectively.
• Explain internal link strategy and the problems and solutions related to it.

7.1 INTRODUCTION

To understand what SEO content means, it's useful to look at the phrase in two parts:

The first part, "SEO" (search engine optimization), is the process of optimizing your website and content so that it shows up higher in search engine results pages for specific search terms.

The second part, "content", is any information that you publish online that can be indexed by search engines. This includes website content, blog posts, images, graphics, and videos.

So, taken as a whole, SEO content is any content that is created to increase search engine rankings and therefore traffic to your website.

Search engines display their organic search results according to the relevance and authority of a web page. Relevance is determined by how often you use specific keywords and phrases within the content of a web page, and authority is determined by the number of trustworthy backlinks that point to that page. To make your content more SEO-friendly, your content should be organized logically, contain relevant keywords, and be written with your audience in mind.

7.2 OPTIMIZE SEO CONTENT

Search engine optimization (SEO) is a continually changing field. But throughout all recent changes, one thing has remained constant, and that's the importance of content to SEO. Content and SEO go hand in hand. People use search engines to find answers or solutions to their questions, and search engines serve up the most relevant content they can find. Whether the top search result is a blog post, a YouTube video, or a product description, it's all content.

There is a deep relationship between content and SEO, and it can be difficult to understand the nuances of how they relate. This guide explores that relationship and provides actionable insights into how businesses can use content to support their SEO efforts.

7.3 KEYWORD RESEARCH

Keyword research is the process of finding and analyzing the search terms that people enter into search engines, with the goal of using that data for a specific purpose, often for search engine optimization (SEO) or general marketing.

Keyword research can uncover queries to target, the popularity of these queries, their ranking difficulty, and more. It provides valuable insight into the queries that your target audience is actually searching on Google, and that insight can inform your content strategy as well as your larger marketing strategy. However, keywords themselves may not be as important to SEO as you may think.

Keyword research tells you what topics people care about and, assuming you use the right SEO tool, how popular those topics actually are among your audience. The operative term here is topics: by researching keywords that get a high volume of searches per month, you can identify and sort your content into the topics you want to create content on. Then, you can use these topics to dictate which keywords you look for and target.

How to Research Keywords for Your SEO Strategy:

Step 1: Make a list of important, relevant topics based on what you know about your business. Before choosing keywords and expecting your content to rank for them, you must curate keywords for three things:
• Relevance
• Authority
• Volume
Step 2: Fill in those topic buckets with keywords.
Step 3: Understand how intent affects keyword research and analyze accordingly.
Step 4: Research related search terms.
Step 5: Use keyword research tools to your advantage.

7.4 PAGE SPEED OPTIMIZATION TOOL

PageSpeed Insights (PSI) reports on the performance of a page on both mobile and desktop devices and provides suggestions on how that page may be improved. Page speed has always been a crucial part of SEO work, and as more companies make the shift to online operations, optimization becomes more important than ever. However, it's a complex subject that tends to be very technical. What are the most crucial things to understand about your site's page speed, and how can you begin to improve?

Figure 7.1

What are some of the most common issues that could slow down a website?

1. First and foremost, images. Large images are the biggest culprit of slow-loading web pages.
2. Hosting can cause issues.
3. Plugins, apps, and widgets, and basically any third-party script, can slow down load time.
4. Your theme and any large files beyond that can really slow things down as well.
5. Redirects: the number of hops needed to get to a web page will slow things down.
6. Then JavaScript, which we'll get into in a second.

Solutions to the above issues:

• Test your mobile website speed and performance using tools like testmysite.thinkwithgoogle.com from Google. Pingdom and GTmetrix are non-Google tools, but super helpful as well.
• PageSpeed Insights is really interesting. It has now incorporated the Chrome User Experience Report. But if you're not one of those large sites, it's not even going to measure your actual page speed; it's going to look at how your site is configured, provide feedback according to that, and score it. Just something good to be aware of; it still provides good value.
• HTTP/2 can definitely speed things up. As to what extent, you have to research and test that.
• Preconnect, prefetch, and preload are really interesting and important in speeding up a site (a sketch follows below).
• Enable caching and use a content delivery network (CDN).
• The easiest and probably quickest way for you to speed up your site today is really just to compress those images.
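The resource hints mentioned above are plain <link> elements in the document head. A minimal, hypothetical sketch (the host and file names are placeholders):

<!-- Open a connection to a third-party origin early -->
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin />

<!-- Fetch a resource likely needed for the next navigation, at low priority -->
<link rel="prefetch" href="/next-page.html" />

<!-- Load a critical resource for the current page as early as possible -->
<link rel="preload" href="/fonts/main.woff2" as="font" type="font/woff2" crossorigin />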

7.5 ANCHOR LINKS OPTIMIZATION

Quality links are one of the most important ingredients of a healthy SEO strategy. They help Google and other search engines measure the relevance of websites and return better results to searchers. If you want to improve your search engine rankings, you need quality links. But developing a strong link profile is not all that simple; first, you need to understand what makes a good link, what makes a bad link, and what you can do to optimize your website for better results.

Every web page should have internal links (pointing to the same page or another page on the same domain) and outbound links (pointing to external web domains). They help users navigate a site and find useful information, and help search engines understand and index a site. But they are all formed differently:

• Anchor texts: These are often highlighted, clickable links that can help increase search rankings for specific keywords. However, Google punishes repetitive anchor texts, so use a mixture of non-repetitive branded and keyword-rich phrases.
• Naked URLs: This is when the full URL is displayed in the link. Generally, they're not as powerful as anchor texts for SEO.
• Brand citations: Instead of showing the full URL of a company website, the link is merely the name of the company.
• Image links: An image link can be an excellent navigational tool, but only when it's a link. (The alt attribute of the image acts like anchor text.)
• Reciprocal links: These happen when two webmasters agree to provide a hyperlink to each other's website. If they both share the same target market or offer complementary services or products, the link can be seen as relevant. If not, reciprocal links can harm PageRank.
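In markup, the first four link types above look like this; a hypothetical sketch with invented URLs:

<!-- Anchor text: descriptive, keyword-relevant clickable text -->
<a href="https://www.example.com/seo-guide">beginner's guide to SEO</a>

<!-- Naked URL: the address itself is the visible link -->
<a href="https://www.example.com/seo-guide">https://www.example.com/seo-guide</a>

<!-- Brand citation: the company name is the visible link -->
<a href="https://www.example.com">Example Company</a>

<!-- Image link: the alt attribute acts like anchor text -->
<a href="https://www.example.com/seo-guide">
   <img src="/images/seo-guide-cover.png" alt="SEO guide" />
</a>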

7.6 INTERNAL LINK STRATEGY

Internal links are those that point from one page to another on your site. External links, by contrast, are ones that point to a page on another domain.

Importance of Internal Links for SEO:

1. Internal links help search engines understand your site's structure.
2. Internal links pass authority.
3. Internal links help users to navigate between relevant pages.

Types of Internal Links:

1. Navigational Internal Links: Navigational links typically make up a website's main navigational structure. They are often implemented site-wide and serve the primary purpose of helping users find what they want.
2. Contextual Internal Links: Contextual internal links are typically placed within the main body content of a page.

Common Internal Link Problems:

1. Broken Internal Links
The Problem: Broken internal links send both users and search engine crawlers to non-existent web pages, usually resulting in 404 errors, which is not ideal for communicating authority.
How to Fix: Either remove or replace the link with one that points to a live page.

2. Links Couldn't Be Crawled
The Problem: This error occurs when the format of a URL is incorrect. For example, it might contain unnecessary characters.
How to Fix: Check the links reported as errors and fix formatting issues as necessary.

3. Too Much On-Page Internal Linking
The Problem: When a page contains more than 3,000 links, it will be flagged in the Site Audit report. There is no specific rule on how many on-page links Google will crawl these days, but webmasters need to be mindful of overloading pages from a usability perspective.
How to Fix: Audit any pages that are found to contain more than 3,000 links and remove those that are surplus to requirements.

4. Nofollow Attributes in Outgoing Internal Links
The Problem: The rel="nofollow" attribute in links on certain pages is restricting Googlebot's flow through your site.
How to Fix: Remove the rel="nofollow" attribute from any internal links flagged in the report. This may be set site-wide or on a link-by-link basis; check with your developer if necessary.

5. Orphaned Sitemap Pages
The Problem: An orphaned page is one that is not linked to at all from any other page on your site, which means it can't be accessed in a crawl and can't be indexed.
How to Fix: If an orphaned page could be valuable, include it in your internal linking strategy. If it should not exist or be ranked by search engines, consider either removing it or adding a "noindex" tag.

6. Page Crawl Depth of More Than Three Clicks
The Problem: Some important pages take too many clicks for users to reach, which indicates to search engines that they are not that important.
How to Fix: Work out where you can lose certain clicks to help users get to the content they want quicker.

7. Pages with Only One Internal Link
The Problem: Solitary internal links can mean missed opportunities in both SEO and UX. As we've discussed above, you should be internally linking to key pages from other relevant content as much as it's naturally possible to do so.
How to Fix: Identify other relevant pages to link to as part of your internal linking strategy.

8. Permanent Redirects
The Problem: Passing internal links through permanent redirects can reduce your crawl budget, especially for larger sites.
How to Fix: Update internal links to send users and search engines directly to the destination page (don't remove the redirect, though, if it is still attracting traffic from other sources).

9. Redirect Chains and Loops
The Problem: Internal links that trigger redirect chains and loops are difficult for search engines to crawl. They also create a poor UX.
How to Fix: As above, update internal links so they point to the correct live page. Additionally, remove intermediary redirects in a chain (update the redirect to go from the origin page to the end of the redirect path) or find the cause of loops.

10. Links on HTTPS Pages Lead to HTTP Pages
The Problem: URLs that mistakenly point to HTTP pages on secure sites can cause unnecessary redirects.
How to Fix: Manually update any HTTP links to point to HTTPS pages if it is on a small scale; ask your developer for help if it is site-wide.

How to Build Your Internal Linking Strategy:

Step 1: Identify your site's hub pages.
Step 2: Create topic clusters using internal links.
Step 3: Choose the right anchor text.
Step 4: Identify your site's authority pages.
Step 5: Use internal links to increase the ranking of target pages.
Step 6: Use internal links to optimize fresh content.

Good links include:

 Backlinks from websites that already rank high in SERPs for certain keywords.  Anchor text links that are relevant to the content they are a part of and link to. Bad links include:  Links to websites or link directories with little useful content.  Links to websites that are not related to the content of your website.  Links to poor-quality websites.  Broken links.  Repetitive keywords or exact-match keyword phrases in internal links.  Purchased links without the no follow tag. 7.7 SUMMARY  The first part – “SEO” (search engine optimization) – is the process of optimizing your website and content so that it shows up higher in search engine results pages for specific search terms.  The second part – “content” – is any information that you publish online that can be indexed by search engines. This includes website content, blog posts, images, graphics, and videos.  Content and SEO go hand-in-hand. People use search engines to find answers or solutions to their questions and search engines serve up the most relevant content they can find. While the top search result might be a blog post, a YouTube video or product description – it’s all contents  Keyword research is the process of finding and analyzing search terms that people enter into search engines with the goal of using that data for a specific purpose, often for search engine optimization (SEO) or general marketing. Keyword research can uncover queries to target, the popularity of theses queries, their ranking difficulty, and more.  Page Speed Insights (PSI) reports on the performance of a page on both mobile and desktop devices, and provides suggestions on how that page may be improved.  Quality links are one of the most important ingredients of a healthy SEO strategy. They help Google, and other search engines measure the relevance of websites and return better results to searchers. If you want to improve your search engine rankings, you need quality links. 7.8 KEYWORDS  Links- is a reference to data that the user can follow by clicking or tapping.  Cluster - is a set of computers that work together so that they can be viewed as a single system. 86 CU IDOL SELF LEARNING MATERIAL (SLM)

• Anchor Text – the visible characters and words that hyperlinks display when linking to another document or location on the web.
• Crawl Depth – the extent to which a search engine indexes pages within a website. Most sites contain multiple pages, which in turn can contain subpages.
• URL – an address that shows where a particular page can be found on the World Wide Web.

7.9 LEARNING ACTIVITY
1. List good internal links.
___________________________________________________________________________
___________________________________________________________________________
2. List bad internal links.
___________________________________________________________________________
___________________________________________________________________________

7.10 UNIT END QUESTIONS
A. Descriptive Questions
Short Questions
1. What do you mean by optimized SEO content?
2. How do you research keywords for your SEO strategy?
3. How do you build your internal linking strategy?
4. Explain the following problems and their solutions:
a. Links couldn't be crawled
b. Too much on-page internal linking
c. Nofollow attributes in outgoing internal links
Long Questions
1. What are some of the most common issues that could slow down a website? What are their solutions?
2. List common internal link problems and their solutions.
3. Explain anchor link optimization in detail.
B. Multiple Choice Questions
1. Contextual internal links are typically placed _______ content of a page

a. within the main body
b. outside the main body
c. within the paragraph
d. outside the paragraph
2. Quality links are one of the most important _______ of a healthy SEO strategy
a. link
b. ingredients
c. class
d. tags
3. SEO content is any content that is created to increase search engine _______ and therefore traffic to your website
a. Data
b. Content
c. Rankings
d. Speed
4. Relevance is determined by how often you use specific ________
a. word

b. link
c. tag
d. keyword
5. PSI stands for
a. PageSpeed Insights
b. Paste Special Insights
c. Page Speed Indent
d. None of these

Answers
1-a, 2-b, 3-c, 4-d, 5-a

7.11 REFERENCES
Reference books
• Jerri Ledford, "Search Engine Optimization", Wiley Publishing Inc.
• S.S. Niranga, "Mobile Web Performance Optimization", PACKT Publishing.
Textbook references
• Danny Dover, "Search Engine Optimization Secrets", Wiley Publishing Inc.
• Bruce Clay, "Search Engine Optimization All-in-One for Dummies", John Wiley & Sons.
• Aaron Matthew Wall, "Search Engine Optimization", http://www.seobook.com/seo-tools.pdf
• H.J. Bernardin, Human Resource Management, Tata McGraw Hill, New Delhi, 2004.
Website

• https://www.umassmed.edu/globalassets/it/web-services/google-analytics/google-analytics-user-guide.pdf
• https://static.googleusercontent.com/media/www.google.com/en//grants/education/Google_Analytics_Training.pdf
• https://analytics.google.com/analytics/web/#/p287091400/reports/reportinghub
• Internet Basics: What is the Internet? (gcfglobal.org)
• https://www.semrush.com/blog/black-hat-seo/

UNIT- 8: GOOGLE SEO STRUCTURE
8.0 Learning Objectives
8.1 Introduction
8.2 Google SEO Guidelines
8.3 Google Page Rank
8.4 Creating Robots File
8.5 Google Webmaster Tools Account Setup & Monitoring
8.6 Summary
8.7 Keywords
8.8 Learning Activity
8.9 Unit End Questions
8.10 References

8.0 LEARNING OBJECTIVES
After studying this unit, you will be able to:
• Describe the general guidelines of Google SEO
• Set up pages that visitors can navigate more easily
• Explain Google's quality guidelines
• Create a robots.txt file

8.1 INTRODUCTION
To understand what SEO content means, it's useful to look at the phrase in two parts:
The first part – "SEO" (search engine optimization) – is the process of optimizing your website and content so that it shows up higher in search engine results pages for specific search terms.
The second part – "content" – is any information that you publish online that can be indexed by search engines. This includes website content, blog posts, images, graphics, and videos.
So, taken as a whole, SEO content is any content that is created to increase search engine rankings and therefore traffic to your website.
Search engines display their organic search results according to the relevance and authority of a web page. Relevance is determined by how often you use specific keywords and phrases within the content of a web page, and authority is determined by the number of trustworthy backlinks that point to that page.

To make your content more SEO-friendly, your content should be organized logically, contain relevant keywords, and be written with your audience in mind.

8.2 GOOGLE SEO GUIDELINES
General guidelines
Help Google find your pages
• Ensure that all pages on the site can be reached by a link from another findable page. Make sure the referring link includes either text or, for images, an alt attribute that is relevant to the target page. Crawlable links are <a> tags with an href attribute.
• Provide a sitemap file with links that point to the important pages on your site. Also provide a page with a human-readable list of links to these pages (sometimes called a site index or site map page).
• Limit the number of links on a page to a reasonable number (a few thousand at most).
• Make sure that your web server correctly supports the If-Modified-Since HTTP header. This feature directs your web server to tell Google if your content has changed since we last crawled your site. Supporting this feature saves you bandwidth and overhead (a quick way to test it is sketched below).
• Use the robots.txt file on your web server to manage your crawling budget by preventing crawling of infinite spaces such as search result pages. Keep your robots.txt file up to date. Learn how to manage crawling with the robots.txt file. Test the coverage and syntax of your robots.txt file using the robots.txt Tester.
Ways to help Google find your site:
• Ask Google to crawl your pages.
• Make sure that any sites that should know about your pages are aware your site is online.
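The If-Modified-Since guideline above is easy to verify yourself. Below is a minimal sketch in Python using the third-party requests library; the URL and date are placeholders, and a server that supports conditional requests should answer 304 Not Modified when the content has not changed since the given date.

# A minimal sketch of testing If-Modified-Since support, assuming the
# third-party "requests" library is installed (pip install requests).
# The URL and date below are placeholders - substitute your own page.
import requests

response = requests.get(
    "https://www.example.com/",
    headers={"If-Modified-Since": "Sat, 01 Jan 2022 00:00:00 GMT"},
)

if response.status_code == 304:
    print("Server supports conditional requests (304 Not Modified).")
else:
    # 200 means the full page was returned; the content may simply be
    # newer than the date, or the server may be ignoring the header.
    print("Got", response.status_code, "- full response body returned.")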

Help Google understand your pages
• Create a useful, information-rich site, and write pages that clearly and accurately describe your content.
• Think about the words users would type to find your pages, and make sure that your site actually includes those words within it.
• Ensure that your <title> elements and alt attributes are descriptive, specific, and accurate.
• Design your site to have a clear conceptual page hierarchy.
• Follow our recommended best practices for images, video, and structured data.
• When using a content management system (for example, Wix or WordPress), make sure that it creates pages and links that search engines can crawl.
• To help Google fully understand your site's contents, allow all site assets that would significantly affect page rendering to be crawled: for example, CSS and JavaScript files that affect the understanding of the pages. The Google indexing system renders a web page as the user would see it, including images, CSS, and JavaScript files. To see which page assets Googlebot cannot crawl, use the URL Inspection tool. To debug directives in your robots.txt file, use the robots.txt Tester tool.
• Allow search bots to crawl your site without session IDs or URL parameters that track their path through the site. These techniques are useful for tracking individual user behavior, but the access pattern of bots is entirely different. Using these techniques may result in incomplete indexing of your site, as bots may not be able to eliminate URLs that look different but actually point to the same page.
• Make your site's important content visible by default. Google is able to crawl HTML content hidden inside navigational elements such as tabs or expanding sections. However, we consider this content less accessible to users, and recommend that you make your most important information visible in the default page view.
• Make a reasonable effort to ensure that advertisement links on your pages do not affect search engine rankings. For example, use robots.txt, rel="nofollow", or rel="sponsored" to prevent advertisement links from being followed by a crawler.
Help visitors use your pages:
• Try to use text instead of images to display important names, content, or links. If you must use images for textual content, use the alt attribute to include a few words of descriptive text.
• Ensure that all links go to live web pages. Use valid HTML.
• Optimize your page loading times. Fast sites make users happy and improve the overall quality of the web (especially for those users with slow Internet connections). Google recommends that you use tools like PageSpeed Insights and Webpagetest.org to test the performance of your page.
• Design your site for all device types and sizes, including desktops, tablets, and smartphones. Use the Mobile-Friendly Test to test how well your pages work on mobile devices, and get feedback on what needs to be fixed.
• Ensure that your site appears correctly in different browsers.
• If possible, secure your site's connections with HTTPS. Encrypting interactions between the user and your website is a good practice for communication on the web.
• Ensure that your pages are useful for readers with visual impairments, for example, by testing usability with a screen-reader.

Quality guidelines
These quality guidelines cover the most common forms of deceptive or manipulative behavior, but Google may respond negatively to other misleading practices not listed here. It's not safe to assume that just because a specific deceptive technique isn't included on this page, Google approves of it. Website owners who spend their energies upholding the spirit of the basic principles will provide a much better user experience and subsequently enjoy better ranking than those who spend their time looking for loopholes they can exploit.
If you believe that another site is abusing Google's quality guidelines, please let us know by filing a spam report. Google prefers developing scalable and automated solutions to problems, and will use the report for further improving our spam detection systems.
Basic principles:
• Make pages primarily for users, not for search engines.
• Don't deceive your users.
• Avoid tricks intended to improve search engine rankings. A good rule of thumb is whether you'd feel comfortable explaining what you've done to a website that competes with you, or to a Google employee. Another useful test is to ask, "Does this help my users? Would I do this if search engines didn't exist?"
• Think about what makes your website unique, valuable, or engaging. Make your website stand out from others in your field.
Specific guidelines
Avoid the following techniques:
• Automatically generated content
• Participating in link schemes
• Creating pages with little or no original content
• Cloaking
• Sneaky redirects
• Hidden text or links
• Doorway pages
• Scraped content
• Participating in affiliate programs without adding sufficient value
• Loading pages with irrelevant keywords
• Creating pages with malicious behavior, such as phishing or installing viruses, Trojans, or other badware
• Abusing structured data markup
• Sending automated queries to Google
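Cloaking, one of the techniques listed above, means serving different content to search engine crawlers than to human visitors. As a rough illustration only (not an official detection method), the sketch below fetches the same page with two different User-Agent strings and compares the responses. The URL is a placeholder, and legitimate personalization can also make responses differ, so a mismatch is only a prompt for manual review.

# A crude cloaking spot-check: fetch the same page with a browser-like
# User-Agent and a Googlebot-like one, then compare the two responses.
# Assumes the third-party "requests" library; the URL is a placeholder.
import requests

url = "https://www.example.com/"
browser = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}).text
crawler = requests.get(
    url,
    headers={"User-Agent": "Mozilla/5.0 (compatible; Googlebot/2.1; "
                           "+http://www.google.com/bot.html)"},
).text

if browser != crawler:
    print("Responses differ - review manually for possible cloaking.")
else:
    print("Responses are identical.")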

Follow good practices:
• Monitoring your site for hacking and removing hacked content as soon as it appears
• Preventing and removing user-generated spam on your site
If your site violates one or more of these guidelines, then Google may take manual action against it. Once you have remedied the problem, you can submit your site for reconsideration.

8.3 GOOGLE PAGE RANK
PageRank is an algorithm used by Google Search to rank web pages in its search engine results. It is named after both the term "web page" and co-founder Larry Page. PageRank is a way of measuring the importance of website pages. According to Google:
PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites.
Currently, PageRank is not the only algorithm used by Google to order search results, but it is the first algorithm that was used by the company, and it is the best known.
PageRank is a link analysis algorithm that assigns a numerical weighting to each element of a hyperlinked set of documents, such as the World Wide Web, with the purpose of "measuring" its relative importance within the set. The algorithm may be applied to any collection of entities with reciprocal quotations and references. The numerical weight that it assigns to any given element E is referred to as the PageRank of E and denoted by PR(E).
A PageRank results from a mathematical algorithm based on the web graph, created by all World Wide Web pages as nodes and hyperlinks as edges, taking into consideration authority hubs such as cnn.com or mayoclinic.org. The rank value indicates the importance of a particular page. A hyperlink to a page counts as a vote of support. The PageRank of a page is defined recursively and depends on the number and PageRank metric of all pages that link to it ("incoming links"). A page that is linked to by many pages with high PageRank receives a high rank itself.
Numerous academic papers concerning PageRank have been published since Page and Brin's original paper.[5] In practice, the PageRank concept may be vulnerable to manipulation. Research has been conducted into identifying falsely influenced PageRank rankings. The goal is to find an effective means of ignoring links from documents with falsely influenced PageRank.[6]
Other link-based ranking algorithms for web pages include the HITS algorithm invented by Jon Kleinberg (used by Teoma and now Ask.com), the IBM CLEVER project, the TrustRank algorithm and the Hummingbird algorithm.
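In the commonly quoted simplified model, the PageRank of a page p is PR(p) = (1 - d)/N + d * sum of PR(v)/L(v) over every page v that links to p, where L(v) is the number of outgoing links on v, N is the total number of pages, and d is a damping factor commonly set to 0.85. Below is a minimal sketch of this iteration in Python; the three-page graph is invented purely for illustration, and the sketch assumes every page has at least one outgoing link (real implementations must also handle dangling pages).

# A minimal PageRank sketch using the simplified formula above.
# Assumes every page has at least one outgoing link.
def pagerank(links, d=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    pr = {p: 1.0 / n for p in pages}  # start from a uniform distribution
    for _ in range(iterations):
        pr = {
            p: (1 - d) / n
               + d * sum(pr[v] / len(links[v]) for v in pages if p in links[v])
            for p in pages
        }
    return pr

# A made-up three-page web: both "b" and "c" link back to "a",
# so "a" ends up with the highest rank.
print(pagerank({"a": ["b"], "b": ["a", "c"], "c": ["a"]}))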

8.4 CREATING ROBOTS FILE
You can control which files crawlers may access on your site with a robots.txt file. A robots.txt file lives at the root of your site. So, for site www.example.com, the robots.txt file lives at www.example.com/robots.txt. robots.txt is a plain text file that follows the Robots Exclusion Standard. A robots.txt file consists of one or more rules. Each rule blocks or allows access for a given crawler to a specified file path in that website. Unless you specify otherwise in your robots.txt file, all files are implicitly allowed for crawling.
Here is a simple robots.txt file with two rules:

User-agent: Googlebot
Disallow: /nogooglebot/

User-agent: *
Allow: /

Sitemap: http://www.example.com/sitemap.xml

Here's what that robots.txt file means:
1. The user agent named Googlebot is not allowed to crawl any URL that starts with http://example.com/nogooglebot/.
2. All other user agents are allowed to crawl the entire site. This could have been omitted and the result would be the same; the default behavior is that user agents are allowed to crawl the entire site.
3. The site's sitemap file is located at http://www.example.com/sitemap.xml.

Basic guidelines for creating a robots.txt file
Creating a robots.txt file and making it generally accessible and useful involves four steps:
1. Create a file named robots.txt.
2. Add rules to the robots.txt file.
3. Upload the robots.txt file to your site.
4. Test the robots.txt file.

Here are some common useful robots.txt rules:

1. Disallow crawling of the entire website. Keep in mind that in some situations URLs from the website may still be indexed, even if they haven't been crawled. This does not match the various AdsBot crawlers, which must be named explicitly.

User-agent: *
Disallow: /

2. Disallow crawling of a directory and its contents. Append a forward slash to the directory name to disallow crawling of a whole directory. Remember, don't use robots.txt to block access to private content; use proper authentication instead. URLs disallowed by the robots.txt file might still be indexed without being crawled, and the robots.txt file can be viewed by anyone, potentially disclosing the location of your private content.

User-agent: *
Disallow: /calendar/
Disallow: /junk/

3. Allow access to a single crawler. Only Googlebot-news may crawl the whole site.

User-agent: Googlebot-news
Allow: /

User-agent: *
Disallow: /

4. Allow access to all but a single crawler. Unnecessarybot may not crawl the site; all other bots may.

User-agent: Unnecessarybot
Disallow: /

User-agent: *
Allow: /

5. Disallow crawling of a single web page. For example, disallow the useless_file.html page.

User-agent: *
Disallow: /useless_file.html

6. Block a specific image from Google Images. For example, disallow the dogs.jpg image.

User-agent: Googlebot-Image
Disallow: /images/dogs.jpg

7. Block all images on your site from Google Images. Google can't index images and videos without crawling them.

User-agent: Googlebot-Image
Disallow: /

8. Disallow crawling of files of a specific file type. For example, disallow crawling of all .gif files.

User-agent: Googlebot
Disallow: /*.gif$

9. Disallow crawling of the entire site, but allow Mediapartners-Google. This implementation hides your pages from search results, but the Mediapartners-Google web crawler can still analyze them to decide what ads to show visitors on your site.

User-agent: *
Disallow: /

User-agent: Mediapartners-Google
Allow: /

10. Use $ to match URLs that end with a specific string. For example, disallow all .xls files.

User-agent: Googlebot
Disallow: /*.xls$

Table 8.1
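A robots.txt file can also be tested programmatically. The sketch below uses Python's standard urllib.robotparser module to check URLs against the two-rule sample file from the start of this section; the example.com URLs are placeholders for your own site.

# A minimal sketch that checks URLs against a live robots.txt file,
# using only Python's standard library. The example.com URLs stand in
# for the two-rule sample file shown earlier in this section.
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("http://www.example.com/robots.txt")
rp.read()  # fetch and parse the file

# Googlebot is blocked from /nogooglebot/ by the sample rules...
print(rp.can_fetch("Googlebot", "http://www.example.com/nogooglebot/page.html"))  # False
# ...while every other user agent may crawl the whole site.
print(rp.can_fetch("SomeOtherBot", "http://www.example.com/nogooglebot/page.html"))  # True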

8.5 GOOGLE WEBMASTER TOOLS ACCOUNT SETUP & MONITORING
Set up a Google account
If you don't already have a Google account, you'll need to sign up for one. Just go to Google.com, click 'Sign in', and sign up for an account. This will give you a Google Gmail address as well as access to other Google applications.
Next, set up a Webmaster Tools account
To set up Webmaster Tools, visit http://www.google.com/webmasters/tools and sign in with your Google account, preferably the same one you use for Google Analytics (GA). Click 'Add a Site' and enter your site's URL.
Next, you'll need to verify that the site is yours. You'll be given a number of options, but the easiest way is to verify via your GA account. For this to happen smoothly, you'll need to be an administrator of the GA account (if you've only got 'User' permissions, ask the site's administrator to upgrade you).
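Among the other verification options Webmaster Tools has offered is an HTML meta tag that you paste into the <head> of your home page. The snippet below is illustrative only; the content value is a placeholder for the token that Google generates for your account.

<meta name="google-site-verification" content="YOUR-VERIFICATION-TOKEN" />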

That's it. Your account should be authorized and ready to use. Though, when you're setting up Webmaster Tools for the first time, it may take a few days for data to appear.
Add a sitemap
Once your site is verified, you should create and submit a sitemap via the WMT interface. A sitemap is a simple file that tells Google what pages you have on your website. You can find out more about sitemaps here. If you are running a WordPress website, it's worth installing the XML sitemaps plugin, which will build an XML sitemap for you.
There's loads more to learn about Webmaster Tools, and our friends at KISSmetrics have produced a useful guide. And, if you haven't already done so, it's worth checking out Wordtracker's guide to setting up Google Analytics.

8.6 SUMMARY
• Make sure that your web server correctly supports the If-Modified-Since HTTP header. This feature directs your web server to tell Google if your content has changed since we last crawled your site. Supporting this feature saves you bandwidth and overhead.
• To help Google fully understand your site's contents, allow all site assets that would significantly affect page rendering to be crawled: for example, CSS and JavaScript files that affect the understanding of the pages. The Google indexing system renders a web page as the user would see it, including images, CSS, and JavaScript files. To see which page assets Googlebot cannot crawl, use the URL Inspection tool. To debug directives in your robots.txt file, use the robots.txt Tester tool.
• Design your site for all device types and sizes, including desktops, tablets, and smartphones. Use the Mobile-Friendly Test to test how well your pages work on mobile devices, and get feedback on what needs to be fixed.
• If you believe that another site is abusing Google's quality guidelines, please let us know by filing a spam report. Google prefers developing scalable and automated solutions to problems, and will use the report for further improving our spam detection systems.
• Avoid tricks intended to improve search engine rankings. A good rule of thumb is whether you'd feel comfortable explaining what you've done to a website that competes with you, or to a Google employee. Another useful test is to ask, "Does this help my users? Would I do this if search engines didn't exist?"

