ISSN 2319 – 1953
International Journal of Scientific Research in Computer Science Applications and Management Studies

An Efficient English To Hindi Translator
Dhawal Jain¹, Aditi Jadhav², Ateeq Ansari³, Aditi Raut⁴
¹,²,³,⁴ (Department of Computer Engineering, St. John College Of Engineering & Management, Palghar, Maharashtra, India)
¹ jaindhawal05@gmail.com, ² jadhavadi25@gmail.com, ³ ateeqnsr8@gmail.com, ⁴ aditir@sjcet.co.in

Abstract— Machine translation is the translation of one natural language into another by automated computing. Its primary objective is to bridge the language gap between people, communities, or countries that speak different languages. India is a multilingual country; different states have different regional languages, but not all Indians are polyglots. There are 22 constitutionally recognized languages and ten prominent scripts. The majority of Indians, especially those in remote villages, do not understand, read, or write English, so an efficient language translator is needed. Machine translation systems that translate text from one language to another will help build a society free of language barriers. Since English is a universal language and Hindi is the language used by the majority of Indians, we propose an English to Hindi machine translation system based on a recurrent neural network (RNN), long short-term memory (LSTM), and an attention mechanism.

Keywords— RNN, LSTM, Attention mechanism.

I. INTRODUCTION
Machine translation has been under development since the 1940s. A machine translation (MT) system translates text or speech from one natural language to another. It is needed to convert documents or text from other commonly known languages into our native language, and so it overcomes language barriers. Natural language processing (NLP) is the field of computer science that strives to fill this gap. Neural machine translation requires minimal domain knowledge and is conceptually simple: a large neural network is trained and can generate very long word sequences. Unlike a standard machine translation system, the model does not explicitly store large phrase tables and language models. The first successful demonstration of an MT system was carried out jointly by Georgetown University and IBM in 1954. The importance of machine translation arises from the socio-political significance of translation in communities where more than one language is spoken. In addition, the concept of an attention mechanism is used.

Hindi is widely spoken and is the principal official language of India, whereas English is spoken worldwide and is internationally well known. English was introduced in India as a spoken language during the British period. Both English and Hindi are therefore major, widely used languages, and there is a need for a translator that converts one into the other. Here we study English to Hindi translation. Awareness has recently grown in India of using regional languages such as Hindi for government documents and other purposes. In this context, it has become essential to create an MT system that can translate English into several regional languages. Moreover, many websites are in English and are of no use to rural people, who do not know English and so cannot understand the information given on the site. Hence a translator is needed that converts English into Hindi, which people can easily understand.

II. LITERATURE REVIEW
The paper [1] focuses on rule-based machine translation. It is based on corpus management and a multilingual database. The system architecture comprises a parser and morphological tools that analyse the grammar of the source language and then transform it into the target language. The method suggested in [1] requires a deep understanding of the grammatical structure of both the source and target languages.

Statistical machine translation relies on statistics; the idea behind it comes from information theory. Translation is performed according to a probability distribution. The method suggested in the paper [2] uses the Bayes decision rule and statistical theory to minimize errors. The approach discussed in that paper suffers from a word alignment problem between phrases and a language modeling problem.

In [3], a hybrid mechanism, i.e., a combination of rule-based and statistical machine translation, is used for conversion. The architecture comprises a splitter, parser, declension tagger, sentence rules, reordering, lexical dictionary, and translator. The source-language sentence is passed through the splitter, which divides it into words, and the parser then analyses its syntactic and semantic structure. The declension tagger inflects nouns, adjectives, and pronouns to indicate number (singular or plural), case, and gender. Reordering is then done, and the source language is translated into the target language using lexical rules.

The paper [4] is based on neural machine translation. The architecture discussed in that paper comprises an encoder, decoder, residual connections, etc. The approach is based on modeling the conditional probability of translating a source sentence into the target sentence, and it provides a more accurate translation.

III. METHODOLOGY
A. Architecture Diagram:
The system consists of the following modules:
1. Encoder-Decoder Model
2. LSTM
3. Attention Mechanism

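A neural encoder-decoder translator of the kind used in this work, like the approach of [4], models the conditional probability of a target sentence given a source sentence. One common way to write this is sketched below. The symbols x (the English source sentence), y_1, ..., y_T (the Hindi target words), T (the target length), and θ (the network parameters) are notation introduced here for illustration and do not appear in the original paper.

```latex
P(y_1, \dots, y_T \mid x; \theta)
  = \prod_{t=1}^{T} P\bigl(y_t \mid y_1, \dots, y_{t-1},\, x;\, \theta\bigr)
```

Each factor is the probability of the next target word given the source sentence and the target words produced so far, which is why the decoder is conditioned on its own previous outputs.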
                                                                            IJSRCSAMS 
                     Volume 8, Issue 1 (January 2019)                                                           www.ijsrcsams.com 
             
            
Fig. 1. Architecture Diagram

B. Encoder-Decoder Model:
It is a way of organizing recurrent neural networks (RNNs) to tackle sequence-to-sequence prediction problems in which the numbers of input and output time steps differ. The model was built for machine translation, such as translating sentences from English to Hindi.
The model involves two sub-models, as follows:
Encoder: The encoder is an RNN that reads the entire source sequence and compresses it into a fixed-length encoding.
Decoder: The decoder is an RNN that takes the encoded input sequence and decodes it to output the target sequence.
Fig. 2 shows the relationship between the encoder and the decoder models.

Fig. 2. Encoder-Decoder Model

The LSTM recurrent neural network is used as both the encoder and the decoder. The encoder output describes the source sequence and is used to begin the decoding process, which is conditioned on the words already produced as output so far. The hidden state of the encoder at the final time step of the input is used to initialize the state of the decoder.

C. Long Short-term Memory:
Long short-term memory (LSTM) units are units of a recurrent neural network. An RNN composed of LSTM units is often called an LSTM network. The cell remembers values over arbitrary time intervals, and the three gates regulate the flow of information into and out of the cell. There are several architectures of LSTM units. An LSTM cell takes an input and stores it for some period of time. This is equivalent to applying the identity function to the input; because the derivative of the identity function is constant, the gradient does not vanish when an LSTM network is trained with backpropagation through time. The activation function of the LSTM gates is often the logistic function. A typical architecture comprises a memory cell, an input gate, an output gate, and a forget gate.
Input Gate: The input gate is responsible for adding information to the cell state.
Forget Gate: The forget gate is responsible for removing information from the cell state.
Output Gate: The output gate produces the output.

Fig. 3. Long Short Term Memory

D. Attention Mechanism:
The encoder-decoder model is an end-to-end model that performs well on challenging sequence-to-sequence prediction problems such as machine translation. However, the model appears to be limited on very long sequences because of the fixed-length encoding of the source sequence. Attention is a mechanism that provides a richer encoding of the source sequence, from which a context vector is built for use by the decoder. The attention mechanism allows the model to learn which encoded words in the source sentence to pay attention to, and to what degree, during the prediction of each word in the target sentence. The hidden state for every input time step is collected from the encoder, rather than only the hidden state of the final step of the source sequence. A context vector is built specifically for each output word in the target sentence. First, each hidden state from the encoder is scored using a neural network, and the scores are then normalized to a probability distribution over the encoder's hidden states. Finally, these probabilities are used to compute a weighted sum of the encoder hidden states, producing the context vector used in the decoder.

Fig. 4. Attention Mechanism
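The encoder, LSTM cell, and attention steps described in this section can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the weights are random rather than trained, the attention score is a simple dot product standing in for the small scoring network the text mentions, and every name (lstm_step, encode, attention_context, random_params) is hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step; params maps a gate name to (W, b) acting on [h_prev; x]."""
    z = h_prev + x  # list concatenation of previous hidden state and input

    def gate(name, activation):
        W, b = params[name]
        return [activation(a + bias) for a, bias in zip(matvec(W, z), b)]

    f = gate("forget", sigmoid)        # how much old cell state to keep
    i = gate("input", sigmoid)         # how much new information to add
    o = gate("output", sigmoid)        # how much of the cell to expose
    g = gate("candidate", math.tanh)   # candidate values for the cell state
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def random_params(input_size, hidden_size, rng):
    # Untrained placeholder weights (assumption: a real system would learn these).
    def matrix():
        return [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size + input_size)]
                for _ in range(hidden_size)]
    names = ("forget", "input", "output", "candidate")
    return {name: (matrix(), [0.0] * hidden_size) for name in names}


def encode(inputs, params, hidden_size):
    """Run the LSTM over the source sequence, keeping every hidden state."""
    h, c = [0.0] * hidden_size, [0.0] * hidden_size
    states = []
    for x in inputs:
        h, c = lstm_step(x, h, c, params)
        states.append(h)
    return states, (h, c)


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(decoder_state, encoder_states):
    """Score each encoder state, normalize the scores, and take the weighted sum."""
    # Dot-product scoring is an assumption used here for brevity.
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * h[k] for w, h in zip(weights, encoder_states))
               for k in range(dim)]
    return weights, context


rng = random.Random(0)
params = random_params(input_size=3, hidden_size=4, rng=rng)
# Three one-hot "word" vectors standing in for a three-word source sentence.
source = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
encoder_states, (h_final, c_final) = encode(source, params, hidden_size=4)
# The decoder starts from the encoder's final state, as described above.
weights, context = attention_context(h_final, encoder_states)
print("attention weights:", weights)
print("context vector:", context)
```

The attention weights sum to one, so the context vector is a weighted average of the encoder hidden states. In a full decoder this computation would be repeated with the current decoder state for every output word.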
            
E. Implementation:
The project converts English text into Hindi. The input can be an English document or a text file, and after processing we obtain the output as Hindi text.
Training Phase:
In this phase, we trained the model on English-Hindi bilingual data for 300 epochs. The training data contains English sentences and words together with their corresponding Hindi sentences and words.
Testing:
In the testing phase, we tested various inputs in the form of PDF, DOC, and similar files. After training for 300 epochs, we achieved an accuracy of 90 to 95%. Most of the input sentences yield a correct output.

IV. RESULTS
We successfully tested our proposed framework with more than twenty individual sentences of different kinds. The following examples illustrate the output for a given input:

Fig. 5. Input textbox

Other results are shown in the table below.

Sr. No. | Input (English) | Output (Hindi)
1 | You're kidding! | मजाक कर रहे हो!
2 | Is there a cafe? | यहाँ कैफे है क्या?
3 | Come if you can. | अंदर आ जाओ।
4 | Make a better translation of the sentence that you are translating. Do not let translations into other languages influence you. | आप जिस वाक्य का अनुवाद कर रहे हैं, उस ही का अच्छी तरह से अनुवाद करें। दूसरी भाषाओं के अनुवादों से प्रभावित न होने दें।

The graph in Fig. 6 shows the accuracy of the implemented system. The x-axis shows the epoch and the y-axis shows the accuracy.
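The paper does not state how the reported accuracy was measured. As a rough sketch under that caveat, the Python snippet below shows two simple ways such a figure could be computed from system outputs and reference translations: sentence-level exact match and position-wise word accuracy. The function names and the toy placeholder sentences are hypothetical and are not taken from the authors' system or data.

```python
def sentence_accuracy(predictions, references):
    """Fraction of outputs that match their reference exactly (outer whitespace ignored)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


def word_accuracy(predictions, references):
    """Fraction of reference words reproduced at the same position in the output."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    matched = 0
    total = 0
    for pred, ref in zip(predictions, references):
        pred_words, ref_words = pred.split(), ref.split()
        total += len(ref_words)
        matched += sum(p == r for p, r in zip(pred_words, ref_words))
    return matched / total if total else 0.0


# Toy placeholder data (hypothetical): w1, w2, ... stand in for target-language words.
references = ["w1 w2 w3", "w4 w5", "w6"]
predictions = ["w1 w2 w3", "w4 w9", "w6"]

sent_acc = sentence_accuracy(predictions, references)
word_acc = word_accuracy(predictions, references)
print(f"sentence accuracy: {sent_acc:.2f}")  # prints "sentence accuracy: 0.67"
print(f"word accuracy: {word_acc:.2f}")      # prints "word accuracy: 0.83"
```

Exact match is strict, because a single wrong word fails the whole sentence, while word accuracy gives partial credit. Evaluating either measure after each epoch would give a curve of accuracy against epoch like the one described above.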
Fig. 6. Output Textbox

V. CONCLUSIONS
In this paper, we built an English to Hindi translator using an RNN. We experimented with long short-term memory (LSTM) and an attention mechanism. Using the attention mechanism and LSTM, correct translation into the target language is made possible. In this project, we have added a feature for directly uploading a document to be translated, which reduces typing time. To make the translation process more efficient, new rules can be added to the system.

ACKNOWLEDGMENT
We thank our guide, Ms. Aditi Raut, who extended valuable guidance and help through the various stages of the development of the project. Her valuable suggestions were of immense help throughout the project work.
We convey our sincere regards to our respected principal, Dr. G.V. Mulgund, and Head of Department, Dr. G.A. Walikar, for their valuable support.
               
REFERENCES
[1] Shachi Mall, Umesh Jaiswal. 2013. Developing a system for machine translation from Hindi to English. In 2013 4th International Conference on Computer and Communication Technology (ICCCT).
[2] A. R. Babhulgaonkar, S. V. Bharad. 2017. Statistical Machine Translation. In 2017 1st International Conference on Intelligent System and Information Management (ICISIM), October 5-6, 2017, Aurangabad, India.
[3] Jayshree Nair, Amrutha Krishnan, Deetha R. 2017. An efficient English to Hindi machine translation using a hybrid mechanism. In 2016 Intl. Conference on Advances in Computing, Communications, and Informatics (ICACCI), Sept. 21-24, 2016, Jaipur, India.
[4] Karthik Revanuru, Kaushik Turlapaty, and Shrisha Rao. 2017. Neural Machine Translation of Indian Languages. In Compute '17: 10th Annual ACM India Compute Conference, November 16-18, 2017, Bhopal, India.
[5] Brenda Reyes Ayala, Jiangping Chen. 2017. A Machine Learning Approach to Evaluating Translation Quality. IEEE, 2017.
[6] Hybrid machine translation for English to Marathi: A research evaluation in Machine Translation. March 2016.
[7] Kamala Kant Yadav, Dr. Umesh Chandra Jaiswal. A Survey Paper on Performance Improvement of Word Alignment in English to Hindi Translation System. In 2017 International Conference on Intelligent Computing and Control (I2C2).
[8] Pankaj Kumar, Sheetal Srivastava, Monica Joshi. Syntax Directed Translator for English to Hindi Language. In 2015 IEEE International Conference on Research in Computational Intelligence and Communication Networks.
[9] Brian Sam Thomas, Rajat Dogra, Bhaskar Dixit, Aditi Raut. Automatic Image and Video Colourisation using Deep Learning. In 2018 International Conference on Smart City and Emerging Technology (ICSCET), Mumbai, 2018.