Macro: Identify Language of Paragraph
Thread poster: Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Jan 12, 2023

Paul Edstein wrote this macro to assign the English language to paragraphs in a German document.

Users can change the languages and extend the list of identifying words.

Macro:

Sub IdentifyEnglishParagraphs()
    Application.ScreenUpdating = False
    Dim strWords As String, i As Long
    ' Comma-separated words that mark a paragraph as English
    strWords = "whether,question"

    With ActiveDocument
        With .Range
            ' Start with the whole document tagged as German
            .LanguageID = wdGerman
            With .Find
                .Replacement.ClearFormatting
                .Replacement.Text = "^&"
                ' Restrict matches to text still tagged as German (needs .Format = True)
                .LanguageID = wdGerman
                .Forward = True
                .Wrap = wdFindContinue
                .Format = True
                .MatchCase = False
                .MatchWholeWord = True
                .MatchWildcards = False
                .MatchSoundsLike = False
                .MatchAllWordForms = True
            End With
            For i = 0 To UBound(Split(strWords, ","))
                .Find.Text = Split(strWords, ",")(i)
                Do While .Find.Execute
                    ' Tag the paragraph containing the match as English
                    .Paragraphs.First.Range.LanguageID = wdEnglishUK
                    .Collapse wdCollapseEnd
                Loop
            Next i
        End With
    End With
    ' Turn screen updating back on (the original left it switched off)
    Application.ScreenUpdating = True
End Sub
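
To show what changing the languages and the word list could look like, here is a minimal sketch that turns them into arguments. The names TagParagraphsByWords and DemoTagFrench, and the French sample words, are made up for illustration and are not part of Paul Edstein's macro. The search logic is the same as above, so the same caveats apply.

```vba
' Hypothetical generalisation: languages and word list are passed in as arguments.
Sub TagParagraphsByWords(ByVal strWords As String, _
                         ByVal lngBaseLang As WdLanguageID, _
                         ByVal lngTargetLang As WdLanguageID)
    Dim varWords As Variant, i As Long
    varWords = Split(strWords, ",")   ' split once instead of on every pass

    Application.ScreenUpdating = False
    With ActiveDocument.Range
        ' Tag the whole document with the base language first
        .LanguageID = lngBaseLang
        With .Find
            .ClearFormatting
            ' Only search text that still carries the base language
            .LanguageID = lngBaseLang
            .Format = True
            .Forward = True
            .Wrap = wdFindContinue
            .MatchCase = False
            .MatchWholeWord = True
            .MatchWildcards = False
        End With
        For i = 0 To UBound(varWords)
            .Find.Text = varWords(i)
            Do While .Find.Execute
                .Paragraphs.First.Range.LanguageID = lngTargetLang
                .Collapse wdCollapseEnd
            Loop
        Next i
    End With
    Application.ScreenUpdating = True
End Sub

' Example call: mark French paragraphs in a Dutch document (sample words only).
Sub DemoTagFrench()
    TagParagraphsByWords "pourquoi,beaucoup", wdDutch, wdFrench
End Sub
```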




 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
New approach Jan 12, 2023

Since strWords can contain a lot of words (e.g. the 500 most frequent English words that don't also exist in German), I think a better approach would be this:

  1. Open the document with the frequent/typical words and copy its content to the clipboard.
  2. Cycle through the paragraphs of the document to be spell-checked and split each one into words.
  3. Search the clipboard content for every word of the paragraph (it will probably be necessary to wrap each word in delimiters).
  4. As soon as one word of the paragraph matches the content of the clipboard, assign the English language to that paragraph and move on to the next paragraph.


This should be much faster.

Feel free to implement this.
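
One possible implementation of the four steps, as a rough sketch rather than a tested solution. Two assumptions: the word list is held in a "$"-delimited string constant instead of the clipboard (WORD_LIST is a made-up name with a handful of sample words), and Word's own Words collection does the splitting. I have not measured whether this is really faster.

```vba
Sub MarkEnglishParagraphs()
    ' Assumption: word list kept in memory, every word wrapped in "$" delimiters
    Const WORD_LIST As String = "$whether$question$the$with$words$"
    Dim para As Paragraph
    Dim wrd As Range
    Dim strWord As String

    Application.ScreenUpdating = False
    For Each para In ActiveDocument.Paragraphs
        For Each wrd In para.Range.Words
            ' Items in Words carry trailing spaces; strip them and ignore case
            strWord = LCase$(Trim$(wrd.Text))
            If Len(strWord) > 0 Then
                ' The delimiters ensure that only whole words match
                If InStr(1, WORD_LIST, "$" & strWord & "$", vbBinaryCompare) > 0 Then
                    para.Range.LanguageID = wdEnglishUK
                    Exit For   ' one hit is enough, move on to the next paragraph
                End If
            End If
        Next wrd
    Next para
    Application.ScreenUpdating = True
End Sub
```

Punctuation marks and paragraph marks also turn up as items in Words, but they never match a delimited entry, so they are simply skipped.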

Words of the second paragraph of this posting:

$and$
$clipboard$
$content$
$copy$
$document$
$frequent$
$open$
$the$
$to$
$typical$
$with$
$words$


Test words:

$a$ $about$ $act$ $actually$ $add$ $after$ $again$ $against$ $age$ $ago$ $air$ $all$ $also$ $always$ $am$ $among$ $an$ $and$ $animal$ $another$ $answer$ $appear$ $are$ $area$ $as$ $ask$ $at$ $back$ $ball$ $base$ $be$ $beauty$ $because$ $become$ $bed$ $been$ $before$ $began$ $begin$ $behind$ $best$ $better$ $better$ $between$ $big$ $bird$ $black$ $blue$ $boat$ $body$ $book$ $both$ $bottom$ $box$ $boy$ $bring$ $brought$ $build$ $built$ $busy$ $but$ $by$ $call$ $came$ $can$ $car$ $care$ $carefully$ $carry$ $centre$ $certain$ $change$ $check$ $child$ $children$ $city$ $class$ $clear$ $close$ $cold$ $colour$ $come$ $common$ $community$ $complete$ $contain$ $could$ $country$ $course$ $create$ $cried$ $cross$ $cry$ $cut$ $dark$ $day$ $decide$ $decided$ $deep$ $develop$ $did$ $didn’t$ $different$ $do$ $does$ $dog$ $don’t$ $door$ $down$ $draw$ $dream$ $drive$ $dry$ $during$ $each$ $early$ $earth$ $east$ $easy$ $eat$ $effort$ $enough$ $every$ $example$ $experience$ $explain$ $eye$ $face$ $fact$ $false$ $family$ $far$ $farm$ $fast$ $father$ $feel$ $feet$ $few$ $field$ $find$ $fire$ $first$ $fish$ $five$ $fly$ $follow$ $food$ $form$ $found$ $four$ $friend$ $from$ $front$ $full$ $game$ $gave$ $get$ $girl$ $give$ $go$ $gold$ $good$ $got$ $government$ $great$ $green$ $ground$ $group$ $grow$ $guy$ $had$ $half$ $hand$ $happen$ $happened$ $hard$ $has$ $have$ $he$ $hear$ $heat$ $heavy$ $help$ $her$ $here$ $high$ $his$ $hold$ $home$ $horse$ $hot$ $hour$ $house$ $hundred$ $idea$ $if$ $important$ $in$ $inch$ $include$ $into$ $is$ $island$ $it$ $just$ $keep$ $kind$ $king$ $knew$ $know$ $known$ $land$ $language$ $large$ $last$ $late$ $later$ $laugh$ $lead$ $learn$ $leave$ $left$ $less$ $less$ $let$ $letter$ $life$ $light$ $like$ $line$ $list$ $listen$ $little$ $live$ $long$ $look$ $love$ $low$ $machine$ $made$ $make$ $man$ $many$ $map$ $mark$ $may$ $mean$ $measure$ $men$ $might$ $mile$ $million$ $mind$ $minute$ $miss$ $money$ $month$ $moon$ $more$ $more$ $morning$ $most$ $mother$ 
$mountain$ $move$ $much$ $music$ $must$ $my$ $name$ $nation$ $near$ $need$ $never$ $new$ $next$ $night$ $no$ $north$ $note$ $notice$ $noun$ $now$ $number$ $object$ $of$ $off$ $office$ $often$ $oh$ $oil$ $old$ $on$ $once$ $one$ $only$ $open$ $or$ $order$ $other$ $our$ $out$ $over$ $page$ $pair$ $part$ $pass$ $passed$ $people$ $perhaps$ $person$ $picture$ $place$ $plan$ $plane$ $plant$ $play$ $point$ $power$ $probably$ $problem$ $product$ $provide$ $pull$ $put$ $question$ $quick$ $rain$ $ran$ $reach$ $read$ $ready$ $real$ $receive$ $record$ $red$ $relationship$ $remember$ $right$ $river$ $road$ $rock$ $room$ $round$ $rule$ $run$ $said$ $same$ $saw$ $say$ $school$ $science$ $sea$ $season$ $second$ $see$ $seem$ $self$ $sentence$ $serve$ $set$ $several$ $shape$ $she$ $ship$ $short$ $should$ $show$ $shown$ $side$ $simple$ $since$ $sing$ $sit$ $six$ $size$ $sleep$ $slow$ $small$ $snow$ $so$ $some$ $something$ $song$ $soon$ $sound$ $south$ $space$ $special$ $spell$ $spring$ $stand$ $star$ $start$ $stay$ $step$ $stood$ $stop$ $story$ $street$ $strong$ $study$ $such$ $summer$ $sun$ $system$ $table$ $take$ $talk$ $teach$ $tell$ $ten$ $test$ $than$ $that$ $the$ $their$ $them$ $then$ $there$ $these$ $they$ $thing$ $think$ $this$ $those$ $though$ $thought$ $thousand$ $three$ $through$ $time$ $to$ $together$ $told$ $too$ $took$ $top$ $toward$ $town$ $travel$ $tree$ $true$ $try$ $turn$ $two$ $under$ $understand$ $until$ $up$ $upon$ $us$ $use$ $usual$ $very$ $voice$ $vowel$ $wait$ $walk$ $want$ $war$ $warm$ $was$ $watch$ $water$ $wave$ $way$ $we$ $week$ $weight$ $were$ $west$ $what$ $wheel$ $where$ $which$ $white$ $who$ $why$ $will$ $wind$ $winter$ $with$ $without$ $woman$ $wonder$ $wood$ $word$ $words$ $work$ $world$ $would$ $write$ $wrong$ $year$ $yes$ $you$ $young$


English words that also exist in German, such as "war" (amazing that this word has such a high frequency in English), "in" and "land", have to be removed from the list:


a
after
all
also
am
an
ball
box
bring
fast
find
form
front
half
happen
hold
in
just
listen
man
oh
open
pass
plan
plane
plant
ran
real
same
sing
so
spring
stand
such
top
turn
war
warm
was
west
will
wind
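
Removing these words by hand is tedious, so here is a small sketch of a helper that could do it. BuildWordList and its two parameters are hypothetical names. It takes two comma-separated lists, drops every word found in the second one, drops repeats, and returns the rest wrapped in "$" delimiters.

```vba
' Returns the words of strWords that are not in strExclude,
' without repeats, wrapped in "$" delimiters.
Function BuildWordList(ByVal strWords As String, _
                       ByVal strExclude As String) As String
    Dim varWord As Variant
    Dim strSkip As String, strOut As String

    ' Turn "war,in" into "$war$in$" so that whole words can be looked up
    strSkip = "$" & Replace(LCase$(strExclude), ",", "$") & "$"
    strOut = "$"
    For Each varWord In Split(LCase$(strWords), ",")
        If Len(varWord) > 0 Then
            ' Keep the word only if it is neither excluded nor already added
            If InStr(1, strSkip, "$" & varWord & "$", vbBinaryCompare) = 0 _
               And InStr(1, strOut, "$" & varWord & "$", vbBinaryCompare) = 0 Then
                strOut = strOut & varWord & "$"
            End If
        End If
    Next varWord
    BuildWordList = strOut
End Function

Sub DemoBuildWordList()
    ' Prints $about$act$ in the Immediate window
    Debug.Print BuildWordList("about,war,act,in,about", "war,in")
End Sub
```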



[Edited at 2023-01-12 18:16 GMT]