
Hide a website from search engines?


4 replies to this topic

#1 hedera

hedera

  • Members
  • 138 posts
  • Local time:09:51 PM

Posted 03 October 2023 - 04:40 PM

I sing in a local chorus, which since at least 2014 has used a Blogger site to post weekly links to the recordings we make of our rehearsals.  The site is for later reference by singers and for people who miss rehearsal.  Most of the music we perform is classical; we occasionally do some pop in a concert around Christmas. I'm not listing the name of the chorus or of the Blogger site, because I've become a little paranoid (see next paragraph).  If you really need names, private message me.

 

I recently horrified our new director by telling him that if he wants to give a rehearsal recording of our current piece (the Stravinsky Symphony of Psalms) to the conductor of our upcoming concert, he should just give her the link to the Blogger site.  The fact that the site was open to search engines led him to fear (apparently this has happened) that we'd be vulnerable to certain companies who search the internet looking for arts organizations to sue, for the crime of recording copyrighted music.  I didn't catch any names of the evil companies, and I wasn't aware of this risk.  Actually, when we want people to know about our site, we usually just give them the URL.  I find that the Stravinsky piece is probably not in the public domain in the U.S.: it was written in 1930, two years after 1928, the year that would mean automatic public domain status in the U.S.

 

I immediately went into the settings of our site, discovered that "visible to search engines" was ON, and changed it to OFF.  I checked the rest of the settings.  The meta tag to enable a search description is off, and there's no description.  Crawlers and robot indexing are turned off, as are custom ads.  We did set up the site with FeedBurner some time ago, but I don't know that anyone's ever used it, and the blog feed is active.

 

My goal is to have a site which can only be reached by the URL and not by a search engine.  Have I done enough?  What else do I need to do?  Some Googling suggests I might have done this by playing with the robot index files, but I've never worked with them. 

 

Is what I want to do even possible?  After I changed the settings, I found that certain search terms (in Google, I didn't try Bing) would bring up my site.  I assume this is because of caching on Internet routers?  Does anyone know how long those usually take to clear out older cache entries?  

 

I could be asked to move the whole process to Chorus Connection (which the chorus subscribes to), where the recordings would only be available to people with a Chorus Connection login (which should be members and staff of the chorus); this is doable but would take a lot of work and documentation.  I would be very grateful for advice.




 


#2 Pkshadow

Pkshadow

  • BC Advisor
  • 12,972 posts
  • Gender:Not Telling
  • Location:On the Brow of the Hill, West Coast, Canada
  • Local time:10:51 PM

Posted 04 October 2023 - 01:36 AM

Hi, what you have done would be, or should be, good enough.  As you said, the Blogger side is the difficult part, since I suppose crawlers can still get in.

 

So, add this tag: <meta name="robots" content="noindex">

 

See the link below.  Important: For the noindex rule to be effective, the page or resource must not be blocked by a robots.txt file, and it has to be otherwise accessible to the crawler. If the page is blocked by a robots.txt file or the crawler can't access the page, the crawler will never see the noindex rule, and the page can still appear in search results, for example if other pages link to it.
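To make the point above concrete, here is a minimal sketch in Python that uses the standard library's `urllib.robotparser` to test whether a given robots.txt would let a crawler fetch a page at all. If it would not, the crawler never reads the page and so never sees a noindex tag on it. The robots.txt contents, the example URL, and the function name `crawler_can_see_noindex` are all hypothetical, chosen only for illustration; they are not taken from the chorus site or from Blogger.

```python
from urllib.robotparser import RobotFileParser


def crawler_can_see_noindex(robots_txt: str, page_url: str,
                            user_agent: str = "Googlebot") -> bool:
    """Return True if robots.txt allows the crawler to fetch the page.

    Only a page the crawler may fetch can have its noindex tag read.
    """
    parser = RobotFileParser()
    # Parse the rules from a string instead of downloading them.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, page_url)


# Hypothetical robots.txt contents and page URL (assumptions for this sketch).
BLOCKING = "User-agent: *\nDisallow: /\n"   # blocks every path
ALLOWING = "User-agent: *\nDisallow:\n"     # empty Disallow blocks nothing
PAGE = "https://example.blogspot.com/2023/10/rehearsal.html"

if __name__ == "__main__":
    for label, rules in (("blocking", BLOCKING), ("allowing", ALLOWING)):
        visible = crawler_can_see_noindex(rules, PAGE)
        print(f"{label} robots.txt: crawler can read the noindex tag = {visible}")
```

With the blocking rules, the noindex tag is never read, which is the situation the note above warns about: the URL can still appear in results if other pages link to it.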

 

The robots.txt file you are looking for is covered here: https://developers.google.com/search/docs/crawling-indexing/robots/robots-faq     You will find the whole site of interest.


" mosquitoes really wake up everyday and choose violence "   — dalia (@_dalia7)
www.cnn.com/2020/07/23/health/mosquitoes-attraction-humans-future-wellness-scn/index.html
 

I-7 ASUS ROG Rampage II Extreme  / ASUS TUF Gaming F17 / I-7 4770K ASUS ROG Maximus VI Extreme


#3 mnathanm

mnathanm

  • Members
  • 3 posts
  • Gender:Male
  • Local time:11:51 AM

Posted 04 October 2023 - 08:40 AM

If you are concerned about your site showing up in search results, you can ask the chorus to move the recordings to Chorus Connection. This would ensure that the recordings are only available to people with a Chorus Connection login. However, as you mentioned, this would take a lot of work and documentation.



#4 hedera

hedera
  • Topic Starter

  • Members
  • 138 posts
  • Local time:09:51 PM

Posted 04 October 2023 - 07:29 PM

Thanks for the suggestions about the robots.txt file, and the link. I'll look into that.  

 

I should have made clear, the recordings are not on the Blogger site.  The recordings are stored at RackSpace, where there is no outside access except for people who are allowed to use the chorus shared password.  (Yes, I know!)  All that's on the Blogger site are links to the recordings stored on RackSpace, along with descriptions of what we went over in a given session.  So if we chose to move to Chorus Connection, the links to the recordings would still work.



#5 Pkshadow

Pkshadow

  • BC Advisor
  • 12,972 posts
  • Gender:Not Telling
  • Location:On the Brow of the Hill, West Coast, Canada
  • Local time:10:51 PM

Posted 04 October 2023 - 08:00 PM

Just put this in the HEAD of every page you do not want indexed:   <meta name="robots" content="noindex">
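One way to confirm the tag actually ended up in the HEAD is to save a page's HTML and check it. The sketch below uses Python's standard `html.parser` module. The class name `RobotsMetaFinder`, the helper `has_noindex`, and the sample page are hypothetical names made up for this example, and it only looks at the meta tag, not at HTTP headers or robots.txt.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots"> tags found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            attr = dict(attrs)
            if (attr.get("name") or "").lower() == "robots":
                content = attr.get("content") or ""
                # content may hold several comma-separated directives
                self.directives += [d.strip().lower() for d in content.split(",")]

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def has_noindex(html: str) -> bool:
    """Return True if the page's HEAD carries a robots noindex directive."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


# Hypothetical page source, for illustration only.
SAMPLE_PAGE = """<html><head>
<title>Rehearsal recordings</title>
<meta name="robots" content="noindex">
</head><body><p>Links to this week's recordings.</p></body></html>"""

if __name__ == "__main__":
    print(has_noindex(SAMPLE_PAGE))
```

To check a real page, paste its saved source (for example from the browser's "view source") into `has_noindex` in place of `SAMPLE_PAGE`.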






