My web site is now available in its complete trilingual version at the address indicated in my signature.
I am currently developing an English version that can be found by adding /en/ to my site's regular URL. In the process, I am copying content from the original site to the future site. For now I have a couple of lines in the robots.txt file of my main site that forbid search engines from visiting the mysitename/en/ directory.
My question has to do with the search engine penalty for having duplicate content on the same web site. What should I use as a guide to help me decide when to start forbidding search engines from accessing the individual English-language pages of my original site, and when to allow them instead to visit the English pages with the same content on my new site?
My new site has English-language URLs; my old site has French URLs with an "_en" suffix. For example, my old site has .html pages with core names like "apprendre_en"
(e.g. www . savoiretcroire / apprendre_en . html),
whereas my new site has the same page named "learning", followed by the .html extension
(e.g. www . savoiretcroire . ca / en / learning . html).
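To make the two situations concrete, here is a rough sketch of how I could check what each robots.txt setup lets a crawler see, using Python's standard urllib.robotparser. The exact robots.txt lines, the http:// scheme and the helper name is_allowed are my own assumptions for illustration. The "after" rules are only one possible way to do the switch, not something I have confirmed is correct.

```python
from urllib.robotparser import RobotFileParser

# Assumed "before" rules: block the new /en/ directory (what I have now, roughly).
ROBOTS_BEFORE = """\
User-agent: *
Disallow: /en/
"""

# Assumed "after" rules: block the old English page instead, open up /en/.
ROBOTS_AFTER = """\
User-agent: *
Disallow: /apprendre_en.html
"""

BASE = "http://www.savoiretcroire.ca"  # scheme is an assumption
OLD_PAGE = BASE + "/apprendre_en.html"
NEW_PAGE = BASE + "/en/learning.html"


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Hypothetical helper: True if `agent` may fetch `url` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Which of the two duplicate pages is visible under each setup.
before = (is_allowed(ROBOTS_BEFORE, OLD_PAGE), is_allowed(ROBOTS_BEFORE, NEW_PAGE))
after = (is_allowed(ROBOTS_AFTER, OLD_PAGE), is_allowed(ROBOTS_AFTER, NEW_PAGE))
print("before switch (old, new):", before)
print("after switch  (old, new):", after)
```

In both setups only one copy of the page is open to crawlers at a time, which is what I am trying to achieve. Note that this standard-library parser only does simple prefix matching, so each old "_en" page would have to be listed on its own Disallow line.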
Does anyone have any comments or insights to share about this?