Reference no: EM132251295
Assignment -
Overview - This assignment focuses on context-free grammars (CFGs). CFGs are used in modelling the syntax of programming languages. Compilers for programming languages will embody a parser, i.e., an algorithm to determine whether a given string belongs to the language generated by the CFG. If the string does belong, then the parser will produce a derivation of the string, which the compiler would then use to translate the string into assembly language. Your overall aim in this coursework is to implement a parser for CFGs. Specifically, you are required to:
1. Devise an encoding (i.e., a data structure) for CFGs that can be used to specify any given CFG in a .txt file, which can then be passed to and processed by the programs in parts 2,3 below.
2. Write a program that can take any given CFG as input (specified as a separate.txt file as in part1) and output an equivalent CFG in Chomsky normal form.
3. Extend the program in part 2 so that, after the conversion to Chomsky normal form, it asks the user for an input string and then determines whether or not the string can be generated by the given CFG. If it can be generated, then it should output a derivation of the string.
Note - More details are in attached file.
Parsing for Context-free Grammars
Task 1 - Devise an encoding, i.e., a data structure, for CFGs that can be processed by the programs in tasks 2 and 3 below. The idea is that it should be possible to specify any given CFG in a separate .txt file using this encoding, which can then be fed as an argument via the command line to the programs below. The encoding can be of your choice, but you must effectively incorporate all four ingredients in the definition of CFG.
Describe and explain your encoding, and illustrate it by giving a screen-shot of your encoding of the example CFG G0 given below. In addition, you must submit a .txt file containing its encoding.
Task 2 - Write a program that can take as input any .txt file containing an encoding of any CFG (using the data structure you provided in task 1) and convert it into Chomsky normal form (see lecture 11). You may use the algorithm for this purpose that was outlined in lecture 11. The program must be executable via the command line, with the .txt file encoding passed as an argument. After execution the program should print to the screen a vertical list of rules in Chomsky normal form, that is, a list of rules with each one taking the form of either
A -> BC
or
A -> x
or
S -> e
where A, B, C are any variables of the given CFG (with B,C not being the start variable), x is any terminal of the CFG, S is the start variable of the CFG and e is the character denoting the empty string.
Using your program, or otherwise, give the CFG G0 given above in Chomsky normal form. (Screenshot acceptable if done using the program). Your code will be tested on the .txt ?le encoding of G0 that you provide in Task 1. In addition it will be tested on two other example CFGs G1 and G2 of my choosing (which will be the same for all students).
Task 3 - Extend your program from Task 2 so that, after conversion to Chomsky normal form, it prompts the user for a string, and then (using the algorithm sketched below) determines whether or not that string is generated by the (Chomsky normal form of the) given CFG. If it isn't then the program should just print something like "string XXX is not generated by this CFG". If it is generated, then it should print to the screen a derivation of that string using the Chomsky normal form of the given CFG. The derivation may be printed in the form of a vertical list of strings, ending in the derived string, and such that each string in the list is derived from the previous one using the rules of the converted CFG.
Attachment:- Assignment File.rar