Template:R:GNV
{{{1}}} at Google Ngram Viewer
Use this template to link to Google Ngram Viewer, which shows a time-dependent graph of word-form or spelling frequencies.
Parameters
The following parameters are used by this template:
|1=
- The term or terms to be graphed.
|2=
- A display override for the term or terms.
|corpus=
- The index of the corpus to be shown; see Available corpora. Defaults to 26, i.e. English.
|startyear=, |start=
- The year to begin the graph at. Defaults to 1800.
|endyear=, |end=
- The year to end the graph at. Defaults to the newest available (see available corpora).
|caseinsensitive=
- Whether to search case-insensitively. Any value is taken to mean yes. Defaults to no.
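The parameters above map naturally onto a Google Ngram Viewer query string. The following is a minimal sketch, not the template's actual implementation: the function name `gnv_url` is hypothetical, the query parameter names (`content`, `year_start`, `year_end`, `corpus`, `case_insensitive`) are assumed from URLs the viewer itself produces, and the default end year of 2019 is an assumption that holds only for the 2019 corpora.

```python
from urllib.parse import urlencode

# Assumed endpoint of the viewer's graph page.
GNV_BASE = "https://books.google.com/ngrams/graph"


def gnv_url(terms, corpus=26, start=1800, end=2019, case_insensitive=False):
    """Build a viewer URL from template-style parameters (hypothetical helper).

    terms            -- the term or terms to graph, comma-separated (|1=)
    corpus           -- numeric corpus index (|corpus=), 26 = English
    start, end       -- first and last year of the graph
    case_insensitive -- any truthy value switches case-insensitive search on
    """
    params = {
        "content": terms,
        "year_start": start,
        "year_end": end,
        "corpus": corpus,
    }
    if case_insensitive:
        # Assumed flag value; the parameter is omitted when the option is off.
        params["case_insensitive"] = "on"
    return GNV_BASE + "?" + urlencode(params)


url = gnv_url("indecipherable, undecipherable")
print(url)
```

`urlencode` percent-encodes the comma and turns the space into `+`, so the terms can be passed exactly as they are written in the template call.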
Examples
Here are some:
* {{R:GNV|indecipherable, undecipherable}}
- indecipherable, undecipherable at Google Ngram Viewer
* {{R:GNV|ad lib, extemporal, extemporary, extemporaneous, extempore, extemporized, impromptu, improvised, improviso, off-the-cuff, offhand|some of the synonyms}}
- some of the synonyms at Google Ngram Viewer
* {{R:GNV|телепрогра́мма, телепереда́ча, телешо́у|corpus=36}}
- телепрогра́мма, телепереда́ча, телешо́у at Google Ngram Viewer
* {{R:GNV|malen, streichen|corpus=31}}
- malen, streichen at Google Ngram Viewer
* {{R:GNV|colour:eng_gb_2019,colour:eng_us_2019}}
- colour:eng_gb_2019,colour:eng_us_2019 at Google Ngram Viewer
* {{R:GNV|croissanterie|corpus=30|start=1900}}
- croissanterie at Google Ngram Viewer
* {{R:GNV|color/colour}}
- color/colour at Google Ngram Viewer
* {{R:GNV|states of *}}
- states of * at Google Ngram Viewer
* {{R:GNV|states of *_NOUN}}
- states of *_NOUN at Google Ngram Viewer
* {{R:GNV|*_ADJ argument}}
- *_ADJ argument at Google Ngram Viewer
* {{R:GNV|cook_NOUN,cook_VERB}}
- cook_NOUN,cook_VERB at Google Ngram Viewer
* {{R:GNV|cook_INF a meal}}
- cook_INF a meal at Google Ngram Viewer
* {{R:GNV|cook_INF *_NOUN}} -- does not work
- cook_INF *_NOUN at Google Ngram Viewer
Available corpora
A list (with descriptions) is also available at https://books.google.com/ngrams/info.
Corpus | 2019 index | 2012 index | 2009 index | Shorthand (followed by _ and year) |
---|---|---|---|---|
American English | 28 | 17 | 5 | eng_us |
British English | 29 | 18 | 6 | eng_gb |
Chinese (simplified) | 34 | 23 | 11 | chi_sim |
English | 26 | 15 | 0 | eng |
English Fiction | 27 | 16 | 4 | eng_fiction |
English One Million | N/A | N/A | 1 | eng_1m |
French | 30 | 19 | 7 | fre |
German | 31 | 20 | 8 | ger |
Hebrew | 35 | 24 | 9 | heb |
Italian | 33 | 22 | N/A | ita |
Russian | 36 | 25 | 12 | rus |
Spanish | 32 | 21 | 10 | spa |
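For scripted use, the table can be held as a lookup from shorthand and year to numeric index. This is only a sketch: the dictionary `CORPORA` and the function `corpus_index` are hypothetical names, and the values are copied from the table above rather than fetched from Google.

```python
# Corpus indices copied from the table above: shorthand -> {year: index}.
# Combinations marked N/A in the table are simply left out.
CORPORA = {
    "eng_us": {2019: 28, 2012: 17, 2009: 5},
    "eng_gb": {2019: 29, 2012: 18, 2009: 6},
    "chi_sim": {2019: 34, 2012: 23, 2009: 11},
    "eng": {2019: 26, 2012: 15, 2009: 0},
    "eng_fiction": {2019: 27, 2012: 16, 2009: 4},
    "eng_1m": {2009: 1},
    "fre": {2019: 30, 2012: 19, 2009: 7},
    "ger": {2019: 31, 2012: 20, 2009: 8},
    "heb": {2019: 35, 2012: 24, 2009: 9},
    "ita": {2019: 33, 2012: 22},
    "rus": {2019: 36, 2012: 25, 2009: 12},
    "spa": {2019: 32, 2012: 21, 2009: 10},
}


def corpus_index(tag):
    """Resolve a tag such as 'eng_gb_2019' to its numeric corpus index."""
    # Split on the last underscore: 'eng_gb_2019' -> ('eng_gb', '2019').
    shorthand, _, year = tag.rpartition("_")
    try:
        return CORPORA[shorthand][int(year)]
    except (KeyError, ValueError):
        raise ValueError(f"unknown corpus tag: {tag!r}") from None


print(corpus_index("eng_gb_2019"))
```

Splitting on the last underscore keeps multi-part shorthands such as `eng_fiction` intact.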
Limitations
Google Ngram Viewer has several limitations: 1) scanning errors (scannos); 2) a corpus increasingly biased toward academic publications over time; 3) equal weight for every book regardless of popularity; 4) wrong assignment of publication year. Some of these problems are covered below. The scanno problem does not seem to invalidate the results completely, especially for English and for longer words. The severity of the problems depends on what is being measured: cultural change over time or the relative frequencies of word forms.
Bias toward academic publication
figure, Figure at Google Ngram Viewer reveals the problem: capitalized Figure rises to the top during the 20th century, which suggests use in the captions of academic literature. When the corpus is restricted to English Fiction, the problem disappears: figure, Figure at Google Ngram Viewer.
Long s vs. f
fuck at Google Ngram Viewer shows the problem: it is implausible that there were so many instances of "fuck" before 1800; rather, these are likely scannos of "suck" caused by the long s (ſ). However, this problem does not occur after 1820.
Dropping hyphens
anti-American, (antiAmerican*10) at Google Ngram Viewer and Template:b.g.c. show the problem: scanning sometimes drops the hyphen. It is implausible that there are so many occurrences of "antiAmerican", and the Google Books search confirms that. Other examples: (exteacher*10),ex-teacher at Google Ngram Viewer, (nonEnglish*10),non-English at Google Ngram Viewer.
Some hyphens are dropped within an unbroken line; others are dropped at a line break, where it is ambiguous whether a hyphen was present.
Dropping spaces
thebook, nonchocolate at Google Ngram Viewer and Template:b.g.c. show the problem: the space was dropped, and the resulting "thebook" is as common as the legitimate "nonchocolate". On the other hand, the book,(thebook*5000) at Google Ngram Viewer shows that this happens relatively rarely.
Joining different columns
Template:b.g.c. shows the scanning problem: there are very few occurrences of "misargument", and some of the items found result from joining parts of different columns in multi-column publications[1]. This particular example does not make it into GNV statistics, though, and it is unclear whether the problem could significantly affect the frequencies of common words.
Changes in capitalization
There is no reason to think there are spurious changes in capitalization. anti-American,(antiamerican*1000) at Google Ngram Viewer looks plausible, unlike anti-American, antiAmerican at Google Ngram Viewer.
Links
- The Pitfalls of Using Google Ngram to Study Language, 2015, wired.com
- Guideline for improving the reliability of Google Ngram studies: Evidence from religious terms, 2019
- W:Google Ngram Viewer#Limitations
Hyphens
As of October 2022:
- To search for hyphenated phrases, do one of the following:
- Enter the phrase as it is and hope that GNV continues to work as before, e.g. non-standard, nonstandard at Google Ngram Viewer
- Make sure to enter spaces around the hyphen and use [] around the term, e.g. [non - standard,nonstandard] at Google Ngram Viewer.
- Google will often pick up a hyphenated phrase as non-hyphenated, whether in the middle of the text or at a line break.
- Thus, comparisons like exteacher, ex-teacher at Google Ngram Viewer show results much more favorable to exteacher than reality warrants. One needs to check in Google Books what is actually found on the scanned pages. The graph still shows convincingly that exteacher is rare; in fact, it is much rarer than shown.
- anti-American, antiAmerican at Google Ngram Viewer shows too many hits for antiAmerican. A similar result is for anti-German, antiGerman at Google Ngram Viewer.
- You can plot frequency ratio:
{{R:GNV|nonstandard/[non - standard]}}
: nonstandard/[non - standard] at Google Ngram Viewer.
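The bracket workaround and the ratio query can be generated mechanically. The sketch below only builds the query text as described in this section; the helper names `gnv_hyphenated` and `gnv_ratio` are hypothetical, and whether GNV keeps honoring this syntax is not guaranteed.

```python
def gnv_hyphenated(term):
    """Rewrite 'non-standard' as '[non - standard]'; leave other terms alone."""
    if "-" not in term:
        return term
    # Put spaces around every hyphen and wrap the whole term in brackets.
    parts = [part.strip() for part in term.split("-")]
    return "[" + " - ".join(parts) + "]"


def gnv_ratio(numerator, denominator):
    """Build a frequency-ratio query such as 'nonstandard/[non - standard]'."""
    return f"{gnv_hyphenated(numerator)}/{gnv_hyphenated(denominator)}"


query = gnv_ratio("nonstandard", "non-standard")
print(query)
```

The resulting string can be passed as the first parameter of the template.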
Further reading
- About Google Ngram Viewer, books.google.com - a how-to
Google Ngram Viewer on the English Wikipedia