


how to create your own coding language

  • Download binary - 125.41 KB
  • Download source - 1.71 MB

Table of Contents

  1. What's a Programming Language?
  2. Why We Need another Programming Language
  3. JavaCC
  4. Java Reflection
  5. Eclipse Configuration
  6. Programming Language Example (Name: St4tic)
    • 6.0- Grammar
    • 6.1- Code Generating
    • 6.2- Using Reflection
    • 6.3- Core Creation
    • 6.4- Making Interpreter
  7. System:out:println(1 + var)
  8. Summary
  9. Reference

1- What's a Programming Language?

A programming language is an artificial language designed to express computations that can be performed by a machine, particularly a computer. Why do we need one? Programming languages can be used to create programs that control the behavior of a machine, to express algorithms precisely, or as a mode of human communication, because it is hard for humans to type raw numbers like "1001011001..." to create very large algorithms or programs such as your operating system.

In reality, a programming language is just a vocabulary and a set of grammatical rules for instructing a computer to perform specific tasks. The term programming language usually refers to high-level languages such as C/C++, Perl, Java and Pascal. In theory, each language has a unique set of keywords (words that it understands) and a special syntax for organizing program instructions, but several implementations can share the same vocabulary and grammar, like "Ruby" and "JRuby".

Regardless of what language we use, we eventually need to convert our program into machine language so that the computer can understand it. There are two ways to do this:

  • Compile the program (like C/C++)
  • Interpret the program (like Perl)

In this article, we take the second way and build an interpreted language, like Perl or Ruby, called "St4tic" for demonstration.

2- Why We Need Another Programming Language

Really, why do we need another? We have many programming languages as we can see in a Wiki list.

But how do you create your own? Even if you have this idea, you might say, "Creating a programming language is impossible for me. I'm not crazy, it's very hard!" Yes, creating a programming language from scratch is hard. You don't have any libraries or source code to follow. It's as hard as setting a M.U.G.E.N configuration to "Level : hard 8" and "Speed : fast 6".

But now, we have many tools like Yacc, JavaCC, etc. for generating source code for us.

Personally, I created my own programming language called Alef++ [http://alefpp.sourceforge.net/] just for fun and for a better understanding: What is a programming language? How does it work? Can I create my own? What's the difference between my own and others?

It's good reading if you're not discouraged yet!

3- JavaCC

"JavaCC (Java Compiler Compiler) is an open source parser generator for the Java programming language. JavaCC is similar to Yacc in that it generates a parser for a formal grammar provided in EBNF notation, except the output is Java source code. Unlike Yacc, however, JavaCC generates top-down parsers, which limits it to the LL(k) class of grammars (in particular, left recursion cannot be used). The tree builder that accompanies it, JJTree, constructs its trees from the bottom up."

Briefly, JavaCC is a tool that generates a parser as Java source code from the rules you define as a grammar. The parser checks source code syntax, somewhat like a regular expression checks text. Don't worry, a JavaCC grammar looks like Java source code, so you only need to be familiar with Java.

4- Java Reflection

The name "Java reflection" isn't quite accurate; perhaps "mirror reflection" is closer to the truth. Let me explain why:

Reflection in Java, Ruby or .NET is really a way of hacking and breaking the rules of object-oriented programming (OOP), much like a mirror reflection.

  • In Java we can't access private members and methods of other classes, but with Java reflection we can do it easily.
  • If we need an external library that was not imported in the compiled code ("import something.*"), we can load it dynamically.
  • If we need an instance of a class that was not declared in the compiled code, we can create it dynamically.
  • Etc.
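
The first and third points in the list above can be sketched in a few lines. This is only an illustration, not part of St4tic: the class Vault and the helper names readPrivateField and newInstanceOf are made up for the example, while the reflection calls themselves (getDeclaredField, setAccessible, Class.forName, getDeclaredConstructor) are standard JDK API.

```java
import java.lang.reflect.Field;

// Hypothetical class, used only for this illustration.
class Vault {
    private String secret = "1234";
}

class ReflectionBasics {
    // Reads a field by name, even if it is declared private.
    static Object readPrivateField(Object target, String fieldName) {
        try {
            Field field = target.getClass().getDeclaredField(fieldName);
            field.setAccessible(true); // switch off the "private" access check
            return field.get(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Creates an instance of a class whose name is only known at run time.
    static Object newInstanceOf(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints 1234
        System.out.println(readPrivateField(new Vault(), "secret"));
        // prints class java.util.ArrayList
        System.out.println(newInstanceOf("java.util.ArrayList").getClass());
    }
}
```

Nothing in main refers to the field "secret" or to ArrayList at compile time; both are found by name while the program runs, which is the kind of lookup an interpreter needs.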

Now you see why! If you know Java persistence: it can read from a database and return a list of objects, one object per row, although you have only defined the table structure in one class. Java persistence does all the work for you, so you no longer need to ask, "How does this work?"

Back to our game analogy: now that the team is complete, with Java reflection as our second player, we just need to choose a battle area and start the fight!

5- Eclipse Configuration

Eclipse, Eclipse and Eclipse... why? If you're lazy like me, creating a text file and writing a grammar without any syntax coloring can be discouraging, and you just want it done like a setup wizard: "Next, Next, Finish!"

Okay, let's configure a battle area. If you don't have Eclipse, download it from here.

Next, follow this setup from SourceForge for configuring JavaCC in Eclipse.

6- Programming Language Example ( Name : St4tic )

Ready!? St4tic is a very small programming language (a nano-programming language) designed to be easy for beginners to understand. Anyone can modify it without much effort, because I created it just for demonstration.

St4tic can do just integer arithmetic (+, -, /, *), has two control structures, "if" and "while", supports importing Java packages and declaring variables, and executes ONLY public static methods such as System.out.println. Not bad?

Before viewing the St4tic grammar, just remember that St4tic is an interpreted language like Perl or Python: it reads text (source code) from a file, parses it, and creates an object tree that it then interprets (executing the instructions).

Fight!

Example file text:

    require java lang.
    "I'm comment
    def var = 13.
    while var > 0 do
        System:out:println( var ).
        var = var - 1.
    stop

image001.gif

(image from wiki)

6.0- Grammar

Open your eyes wide and follow my steps. Remember how I said I'm lazy? I preferred using JTB (Java Tree Builder) to generate all the needed source code without much effort. That's what makes this wizard so nice.

First, we create a JTB file. Do you know how? In theory, you have installed JavaCC in your Eclipse by following the steps above, so you should be good. If you got lost, no problem: you still have 98 credits and can go back and restart.

Okay, now we divide the grammar into three big groups:

  • Options
  • Tokens
  • Rules

If your JDK version doesn't support templates (generics), try setting your project's Java compiler compliance level to 1.5 (Java 5).

Options

    options {
        JDK_VERSION = "1.5";
        STATIC = false;
    }

We target Java Development Kit 1.5, also called Java 5 (JDK_VERSION = "1.5";), and generate instance methods rather than static ones for the parser (STATIC = false;).

Tokens

    SKIP : {
        " "
      | "\t"
      | "\n"
      | "\r"
      | < "\"" (~["\n","\r"])* ("\n" | "\r" | "\r\n") >
    }

This skips the spaces between keywords, tabs, newlines and carriage returns. The last entry skips comments, as in Java. In St4tic, a one-line comment starts with a double quote (") and runs to the end of the line:

    "Comment here...

    TOKEN : {
        < REQUERE: "require" >
      | < IF: "if" >
      | < WHILE: "while" >
      | < DO: "do" >
      | < STOP: "stop" >
      | < DEF : "def" >
    }

As you can guess at first glance, these are the St4tic reserved keywords. St4tic has only six of them.

The "require" keyword imports a Java library, like "import" in Java:

    require java lang.

This imports all "java.lang" classes.

The def keyword declares a variable, like my in Perl. We can't declare a variable without def.

    def myVar = 1.
    def num13 = 13.

"if" and "while" are the classical if-condition and while-loop.

    if 1 > 0 do
        "do something …
    stop

    while 1 > 0 do
        "repeat in infinite loop …
    stop

    TOKEN : {
        < DOT: "." >
      | < COLON: ":" >
      | < EQ: "==" >
      | < GT: ">" >
      | < LT: "<" >
      | < GE: ">=" >
      | < LE: "<=" >
      | < NE: "!=" >
      | < PLUS: "+" >
      | < MINUS: "-" >
      | < MUL: "*" >
      | < DIV: "/" >
      | < MOD: "%" >
      | < ASSIGN: "=" >
    }

Here, we can group the symbols into "math operation symbols" (+, -, *, /, %) and "relational symbols" (>, <, ==, >=, <=, !=).

    TOKEN : {
        < INTEGER_LITERAL: ["1"-"9"] (["0"-"9"])* | "0" >
    }

Literals are what we might call "values" or "data" in St4tic. For example:

    def myAge = 24.
    def var = 666.
    if 11 > 10 do
        …
    stop

The values 24, 666, 11 and 10 are parsed as literals.

    TOKEN : {
        < IDENTIFIER: <LETTER> (<LETTER> | <DIGIT>)* >
      | < #LETTER: ["_","a"-"z","A"-"Z"] >
      | < #DIGIT: ["0"-"9"] >
    }

Identifiers work like literals, but they stand for names, such as the variable names "myAge" and "var". That completes the tokens, which weren't hard at all =). We just need some imagination to come up with keywords and symbols, and we can always borrow existing keywords from other programming languages.
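
To get a feeling for what the generated token manager does with these definitions, here is a rough hand-written sketch using java.util.regex. It is not the code JavaCC produces, and the class name TokenSketch and the "KIND:text" output format are invented for the example. It assumes the same token groups as above, plus the "(", ")" and "," symbols that the rules use later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenSketch {
    private static final String[] KINDS = {"SKIP", "KEYWORD", "INTEGER", "IDENTIFIER", "SYMBOL"};

    // One alternative per token group. Order matters: keywords before
    // identifiers, and two-character symbols before one-character ones.
    private static final Pattern TOKEN = Pattern.compile(
          "(?<SKIP>[ \\t\\r\\n]+|\"[^\\r\\n]*)"
        + "|(?<KEYWORD>(?:require|if|while|do|stop|def)(?![_a-zA-Z0-9]))"
        + "|(?<INTEGER>[1-9][0-9]*|0)"
        + "|(?<IDENTIFIER>[_a-zA-Z][_a-zA-Z0-9]*)"
        + "|(?<SYMBOL>==|>=|<=|!=|[.:><+\\-*/%=(),])");

    // Splits St4tic source text into "KIND:text" strings, dropping skipped input.
    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(source);
        int pos = 0;
        while (pos < source.length()) {
            matcher.region(pos, source.length());
            if (!matcher.lookingAt()) {
                throw new IllegalArgumentException("Unexpected character at offset " + pos);
            }
            for (String kind : KINDS) {
                if (matcher.group(kind) != null) {
                    if (!kind.equals("SKIP")) {
                        tokens.add(kind + ":" + matcher.group());
                    }
                    break;
                }
            }
            pos = matcher.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [KEYWORD:def, IDENTIFIER:var, SYMBOL:=, INTEGER:13, SYMBOL:.]
        System.out.println(tokenize("\"I'm comment\ndef var = 13."));
    }
}
```

The comment line disappears from the output, just as the SKIP section promises, and a word like "done" comes out as an identifier rather than as the keyword "do" followed by "ne".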

Rules

Here is the big challenge. A new programming language could adopt a different or even revolutionary organization for its syntax, but hmm... a hard syntax would also be hard to understand. I preferred something easy, like Pascal or Visual Basic.

Before starting:

"If a rule expects "1 + 1" and you write "1 - 1", the parser generated by JavaCC throws an exception, because it expects "1" followed by "+" and "1", but it finds "-" in place of "+" and can't continue."

Is this understood?

    void Start() : {} {
        ( Require() "." )+
        ( StatementExpression() )*
    }

This is the entry point for St4tic parsing; without it, the parser can't start. This rule requires at least one "require" (notice the "+", meaning one or more), followed by the program instructions (notice the "*", meaning zero or more):

    void Require() : {} {
        "require" ( < IDENTIFIER > )+
    }

A package import can have one word after "require", or several:

    require java .
    require java lang .
    ...

After the imports, we can write the St4tic script itself, a sequence of "statement expressions":

    void StatementExpression() : {} {
        VariableDeclaration()
      | LOOKAHEAD(2) VariableAssign()
      | JavaStaticMethods()
      | IfExpression()
      | WhileExpression()
    }

A "statement expression" is the program body, or algorithm. It can contain variable declarations, variable assignments, logical tests (if, while) and Java method calls (remember, St4tic calls only public static methods).

    void VariableDeclaration() : {} {
        "def" VariableName() "=" MathExpression() "."
    }

    void VariableAssign() : {} {
        VariableName() "=" MathExpression() "."
    }

As you can see, a variable declaration and a variable assignment are identical, except that a declaration starts with "def".

    void JavaStaticMethods() : {} {
        < IDENTIFIER >
        ( ":" < IDENTIFIER > )+
        "(" MathExpression() ( "," MathExpression() )* ")" "."
    }

As its name says =), this rule only invokes static methods. It follows this form:

ClassName:[Method|Members]( number ), for example "System:out:println(1)". Like Java?

Yes, just with the dot (.) replaced by a colon (:).
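
To make the colon notation more concrete, here is a small sketch of how such a path could be resolved with plain reflection. It is not the St4tic implementation: the class ColonPathDemo and its call method are hypothetical, and the sketch assumes that the class lives in java.lang, that every middle segment is a public field, and that the last segment is a public method taking one int.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

class ColonPathDemo {
    // Resolves a path such as "System:out:println" and calls the final
    // method with a single int argument.
    static Object call(String path, int argument) {
        String[] parts = path.split(":");
        try {
            Class<?> owner = Class.forName("java.lang." + parts[0]);
            Object receiver = null; // null means "static access on owner"
            // Every segment between the class and the method is a field.
            for (int i = 1; i < parts.length - 1; i++) {
                Field field = owner.getField(parts[i]);
                receiver = field.get(receiver);
                owner = field.getType();
            }
            Method method = owner.getMethod(parts[parts.length - 1], int.class);
            return method.invoke(receiver, argument);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        call("System:out:println", 1);               // prints 1
        System.out.println(call("Math:abs", -5));    // prints 5
    }
}
```

"System" is looked up as a class, "out" as a static field on it, and "println" as a method on the type of that field, which is the same walk a St4tic statement has to perform at run time.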

    void IfExpression() : {} {
        "if" RelationalExprssion() "do"
        ( StatementExpression() )*
        "stop"
    }

    void WhileExpression() : {} {
        "while" RelationalExprssion() "do"
        ( StatementExpression() )*
        "stop"
    }

These are easy and simple "if" and "while" rules. Finally, here is the full grammar source code. For now it is just an empty parser that checks the syntax without interpreting it (no result). In the next chapters, we add an interpreter for it.

    options {
        JDK_VERSION = "1.5";
        STATIC = false;
    }

    PARSER_BEGIN(St4tic)
    package st4tic;

    import st4tic.syntaxtree.*;
    import st4tic.visitor.*;

    public class St4tic {
        public static void main(String args[]) {
            try {
                Start start = new St4tic(new java.io.StringReader(
                    "require java lang.\n" +
                    "def var = 13.\n" +
                    "while var > 0  do\n" +
                    "System:out:println( var ).\n" +
                    "var = var - 1.\n" +
                    "stop.\n"
                )).Start();
                start.accept(new DepthFirstVisitor());
                System.out.println("Right! no errors founded! =)");
            }
            catch (Exception e) {
                System.out.println("Oops.");
                System.out.println(e.getMessage());
            }
        }
    }
    PARSER_END(St4tic)

    SKIP : {
        " "
      | "\t"
      | "\n"
      | "\r"
      | < "\"" (~["\n","\r"])* ("\n" | "\r" | "\r\n") >
    }

    TOKEN : {
        < REQUERE: "require" >
      | < IF: "if" >
      | < WHILE: "while" >
      | < DO: "do" >
      | < STOP: "stop" >
      | < DEF : "def" >
    }

    TOKEN : {
        < DOT: "." >
      | < COLON: ":" >
      | < EQ: "==" >
      | < GT: ">" >
      | < LT: "<" >
      | < GE: ">=" >
      | < LE: "<=" >
      | < NE: "!=" >
      | < PLUS: "+" >
      | < MINUS: "-" >
      | < MUL: "*" >
      | < DIV: "/" >
      | < MOD: "%" >
      | < ASSIGN: "=" >
    }

    TOKEN : {
        < INTEGER_LITERAL: ["1"-"9"] (["0"-"9"])* | "0" >
    }

    TOKEN : {
        < IDENTIFIER: <LETTER> (<LETTER> | <DIGIT>)* >
      | < #LETTER: ["_","a"-"z","A"-"Z"] >
      | < #DIGIT: ["0"-"9"] >
    }

    void Start() : {} {
        ( Require() "." )+
        ( StatementExpression() )*
    }

    void Require() : {} {
        "require" ( < IDENTIFIER > )+
    }

    void MathExpression() : {} {
        AdditiveExpression()
    }

    void AdditiveExpression() : {} {
        MultiplicativeExpression() ( ( "+" | "-" ) MultiplicativeExpression() )*
    }

    void MultiplicativeExpression() : {} {
        UnaryExpression() ( ( "*" | "/" | "%" ) UnaryExpression() )*
    }

    void UnaryExpression() : {} {
        "(" MathExpression() ")" | < INTEGER_LITERAL > | VariableName()
    }

    void RelationalExprssion() : {} {
        RelationalEqualityExpression()
    }

    void RelationalEqualityExpression() : {} {
        RelationalGreaterExpression()
        ( ( "==" | "!=" ) RelationalGreaterExpression() )*
    }

    void RelationalGreaterExpression() : {} {
        RelationalLessExpression()
        ( ( ">" | ">=" ) RelationalLessExpression() )*
    }

    void RelationalLessExpression() : {} {
        UnaryRelational()
        ( ( "<" | "<=" ) UnaryRelational() )*
    }

    void UnaryRelational() : {} {
        < INTEGER_LITERAL > | VariableName()
    }

    void IfExpression() : {} {
        "if" RelationalExprssion() "do"
        ( StatementExpression() )*
        "stop"
    }

    void WhileExpression() : {} {
        "while" RelationalExprssion() "do"
        ( StatementExpression() )*
        "stop"
    }

    void VariableDeclaration() : {} {
        "def" VariableName() "=" MathExpression() "."
    }

    void VariableAssign() : {} {
        VariableName() "=" MathExpression() "."
    }

    void VariableName() : {} {
        < IDENTIFIER >
    }

    void JavaStaticMethods() : {} {
        < IDENTIFIER >
        ( ":" < IDENTIFIER > )+
        "(" MathExpression() ( "," MathExpression() )* ")" "."
    }

    void StatementExpression() : {} {
        VariableDeclaration()
      | LOOKAHEAD(2) VariableAssign()
      | JavaStaticMethods()
      | IfExpression()
      | WhileExpression()
    }
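
The layering of MathExpression, AdditiveExpression, MultiplicativeExpression and UnaryExpression is what gives "*", "/" and "%" a higher priority than "+" and "-". The hand-written sketch below mirrors those four rules, one method per rule, and evaluates the expression directly instead of building a tree. It is only an illustration under assumptions: the class MathExpressionSketch is not part of St4tic, and variables are passed in as a plain map.

```java
import java.util.Collections;
import java.util.Map;

class MathExpressionSketch {
    private final String source;
    private final Map<String, Integer> variables;
    private int pos = 0;

    private MathExpressionSketch(String source, Map<String, Integer> variables) {
        this.source = source;
        this.variables = variables;
    }

    // MathExpression: AdditiveExpression
    static int evaluate(String source, Map<String, Integer> variables) {
        MathExpressionSketch parser = new MathExpressionSketch(source, variables);
        int value = parser.additive();
        if (parser.peek() != '\0') {
            throw new IllegalArgumentException("Unexpected input at offset " + parser.pos);
        }
        return value;
    }

    // AdditiveExpression: MultiplicativeExpression ( ("+" | "-") MultiplicativeExpression )*
    private int additive() {
        int value = multiplicative();
        while (peek() == '+' || peek() == '-') {
            char op = source.charAt(pos++);
            int right = multiplicative();
            value = (op == '+') ? value + right : value - right;
        }
        return value;
    }

    // MultiplicativeExpression: UnaryExpression ( ("*" | "/" | "%") UnaryExpression )*
    private int multiplicative() {
        int value = unary();
        while (peek() == '*' || peek() == '/' || peek() == '%') {
            char op = source.charAt(pos++);
            int right = unary();
            if (op == '*') value *= right;
            else if (op == '/') value /= right;
            else value %= right;
        }
        return value;
    }

    // UnaryExpression: "(" MathExpression ")" | INTEGER_LITERAL | VariableName
    private int unary() {
        char c = peek();
        if (c == '(') {
            pos++;
            int value = additive();
            if (peek() != ')') {
                throw new IllegalArgumentException("Expected ')' at offset " + pos);
            }
            pos++;
            return value;
        }
        int start = pos;
        if (Character.isDigit(c)) {
            while (pos < source.length() && Character.isDigit(source.charAt(pos))) pos++;
            return Integer.parseInt(source.substring(start, pos));
        }
        if (Character.isLetter(c) || c == '_') {
            while (pos < source.length()
                    && (Character.isLetterOrDigit(source.charAt(pos)) || source.charAt(pos) == '_')) pos++;
            String name = source.substring(start, pos);
            Integer value = variables.get(name);
            if (value == null) {
                throw new IllegalArgumentException("Unknown variable " + name);
            }
            return value;
        }
        throw new IllegalArgumentException("Unexpected input at offset " + pos);
    }

    // Skips whitespace and returns the next character, or '\0' at the end.
    private char peek() {
        while (pos < source.length() && Character.isWhitespace(source.charAt(pos))) pos++;
        return pos < source.length() ? source.charAt(pos) : '\0';
    }

    public static void main(String[] args) {
        // prints 7
        System.out.println(evaluate("1 + 2 * 3", Collections.<String, Integer>emptyMap()));
        // prints 6
        System.out.println(evaluate("(1 + var) * 2", Collections.singletonMap("var", 2)));
    }
}
```

Because additive() always asks multiplicative() for its operands, "2 * 3" is grouped before the "+" is applied. The same trick is why the grammar never needs left recursion, which JavaCC cannot handle.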

6.1- Code Generation

It is simple and easy. Just click "compile with JavaCC" in the context menu, or simply save the file (if auto-compilation is enabled). The result looks like this:

image002.jpg

If you copy and paste it, you might get shocked by a gentle error if you created your JTB file in another package.

In that case, change the package name in the JTB file and in this secret place:

"Project-Properties > JavaCC Options > Tab JTB Option > in p (default) = your new package name". Now that I have given you the secret solution, I hope I'm not responsible for your errors.

If you're lost here, no worries. You still have many credits to restart, maybe now 97 credits!

To try the parser, just run the St4tic.java file. The result is:

"Right! No errors found!"

To test other code, edit the script string here:

    Start start = new St4tic(new java.io.StringReader(
        "require java lang.\n" +
        "def var = 13.\n" +
        "while var > 0  do\n" +
        "System:out:println( var ).\n" +
        "var = var - 1.\n" +
        "stop.\n"
    )).Start();

6.2- Using Reflection

To use Java reflection, we create a small class with these methods:

  • static method full-identifier : parameters (string : class-name ) : return string
  • static method exists-field : parameters ( object : class-instance, string : field-name) : return boolean
  • static method get-field-object : parameters ( object : class-instance, string : field-name) : return object
  • static method exists-method : parameters( object : class-instance, string : method-name, st4tic-value[] : args)  : return boolean
  • static method invoke-static-method : parameters ( object : class-instance, string : method-name, st4tic-value[] : args ) : return object
  • static method push-package : parameters( string : package-name) : return void
  • static method make-object : parameters( string : class-name) :return class
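
As an idea of how push-package and full-identifier might cooperate, here is a minimal sketch. The method names follow the list above, but the bodies are an assumption and not the actual St4tic source: the sketch simply tries each imported package in turn until Class.forName finds the class.

```java
import java.util.ArrayList;
import java.util.List;

class PackageResolverSketch {
    // Packages registered by "require" statements, in import order.
    private static final List<String> PACKAGES = new ArrayList<>();

    // Remembers a package such as "java.lang".
    static void pushPackage(String packageName) {
        PACKAGES.add(packageName);
    }

    // Returns the fully qualified name of a class found in one of the
    // imported packages, or null when no package contains it.
    static String fullIdentifier(String className) {
        for (String packageName : PACKAGES) {
            String candidate = packageName + "." + className;
            try {
                // "false" loads the class without running its static initializers.
                Class.forName(candidate, false, PackageResolverSketch.class.getClassLoader());
                return candidate;
            } catch (ClassNotFoundException e) {
                // Not in this package, try the next one.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        pushPackage("java.lang");
        System.out.println(fullIdentifier("System"));      // prints java.lang.System
        System.out.println(fullIdentifier("NoSuchThing")); // prints null
    }
}
```

Returning null for an unknown class fits the way the interpreter later tests the result of fullIdentifier before going on.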

The complicated part is invoking static methods (or methods in general), because at this step we need to choose the right parameter types ourselves, unlike the compiler, which can automatically convert Java primitive types (int to double, long to float, etc.).

Maybe you don't see a real problem, but imagine if you have

  • class X;
  • and class Z extends X;
  • and you have a method myMethod( X x );

If you pass an instance of class Z in method myMethod and you compile it, your code is accepted with no errors.

But if you use reflection to do it, this is where the holiest of all errors shows itself, like "method does not exist" or "error in object type", because reflection has no automatic casting. You need to do it yourself.
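
The X and Z situation can be reproduced in a few lines. In this sketch, Class.getMethod is asked for myMethod(Z) and fails, because the only declared signature is myMethod(X). One possible workaround, shown here as an assumption and not as what St4tic does, is to scan the public methods and accept any whose parameter types are assignable from the argument types. Holder, AssignableLookup and the helper names are hypothetical.

```java
import java.lang.reflect.Method;

class X {}
class Z extends X {}

// Hypothetical class holding the method from the example above.
class Holder {
    public static String myMethod(X x) {
        return "called with " + x.getClass().getSimpleName();
    }
}

class AssignableLookup {
    // Exact lookup, the way Class.getMethod works: types must match one to one.
    static boolean exactLookupSucceeds(Class<?> owner, String name, Class<?>... argTypes) {
        try {
            owner.getMethod(name, argTypes);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Manual lookup: accept a method when every parameter type can hold
    // the corresponding argument type (X can hold a Z).
    static Method findCompatible(Class<?> owner, String name, Class<?>... argTypes) {
        for (Method method : owner.getMethods()) {
            Class<?>[] params = method.getParameterTypes();
            if (!method.getName().equals(name) || params.length != argTypes.length) {
                continue;
            }
            boolean compatible = true;
            for (int i = 0; i < params.length; i++) {
                if (!params[i].isAssignableFrom(argTypes[i])) {
                    compatible = false;
                    break;
                }
            }
            if (compatible) {
                return method;
            }
        }
        return null;
    }

    // Invokes a public static method chosen by the manual lookup.
    static Object invokeStatic(Class<?> owner, String name, Object... args) {
        Class<?>[] argTypes = new Class<?>[args.length];
        for (int i = 0; i < args.length; i++) {
            argTypes[i] = args[i].getClass();
        }
        Method method = findCompatible(owner, name, argTypes);
        if (method == null) {
            throw new IllegalArgumentException("No compatible method " + name);
        }
        try {
            return method.invoke(null, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(exactLookupSucceeds(Holder.class, "myMethod", Z.class)); // prints false
        System.out.println(invokeStatic(Holder.class, "myMethod", new Z()));        // prints called with Z
    }
}
```

Note that isAssignableFrom does not cover primitive conversions such as int to double; those would need their own table of allowed widenings.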

Okay, in our case there's no need to worry, as we have a simple method:

    @SuppressWarnings("unchecked")
    public static Object invokeStaticSubroutine(
            Object classInstance, String methodName, St4ticValue ... args) {
        try {
            // Accept either a Class object or an ordinary instance.
            Class clazz = classInstance instanceof Class ?
                    (Class) classInstance : classInstance.getClass();

            if (args != null) {
                // Collect the argument types to look up the matching method.
                LinkedList<Class> params = new LinkedList<Class>();
                for (St4ticValue arg : args) {
                    params.add(arg.getType());
                }
                Method method = clazz.getMethod(methodName,
                        params.toArray(new Class[]{}));

                // Collect the argument values for the call itself.
                LinkedList<Object> values = new LinkedList<Object>();
                for (St4ticValue arg : args) {
                    values.add(arg.getValue());
                }
                return method.invoke(classInstance,
                        values.toArray(new Object[]{}));
            }
            else {
                Method method = clazz.getMethod(methodName, new Class[]{});
                return method.invoke(classInstance, new Object[]{});
            }
        }
        // Any reflection failure is ignored and reported as a null result.
        catch (SecurityException se) {
        }
        catch (NoSuchMethodException nsme) {
        }
        catch (IllegalArgumentException iae) {
        }
        catch (IllegalAccessException iae) {
        }
        catch (InvocationTargetException ate) {
        }
        return null;
    }

6.3 - Core Creation

The core package is the heart of St4tic data manipulation. It has just four classes.

image003.gif

(generated by doxygen)

They are very simple classes, and you can view them in the source code: just getters, setters and child lookup.

6.4- Making Interpreter

To keep the interpreter simple, I separated it into another package called "interpreter". There I created an interface called "Interpret" that contains all the needed methods, and implemented it in a class called "Interpreter".

The methods of the "Interpret" interface were copied from the interface st4tic.visitor.Visitor with their signatures changed, as in Alef++,

    from public void visit(Require n);

    to public Object visit(Require node, St4ticScope scope, Object ... objects);

Only visit(Start node) does not take the extra parameters, because this method is the entry point of the St4tic interpreter.

    public Object visit(Start node) throws Exception {
        Enumeration importedPackagesEnum = node.f0.elements();
        while (importedPackagesEnum.hasMoreElements()) {
            NodeSequence ns = (NodeSequence) importedPackagesEnum.nextElement();
            St4ticReflection.pushPackage(
                    this.visit((Require) ns.elementAt(0), null).toString());
        }

        if (node.f1.size() > 0) {
            St4ticScope parent = new St4ticScope(null);
            Enumeration statement = node.f1.elements();
            while (statement.hasMoreElements()) {
                this.visit((StatementExpression) statement.nextElement(), parent);
            }
        }
        return null;
    }

The second important method handles variable declaration and the variable's life cycle: if a scope is destroyed, all its children are lost too, without using the GC.

    public Object visit(VariableDeclaration node, St4ticScope scope,
            Object... objects) throws Exception {
        St4ticVariable var = new St4ticVariable();
        var.setVariableName(this.visit(node.f1, scope, objects).toString());
        var.setVariableValue((St4ticValue) this.visit(node.f3, scope, objects));

        scope.pushChild(var.getVariableName(), var);
        return null;
    }
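
The scope idea can be pictured with a small stand-alone class. This is not the real St4ticScope, whose source is in the download; ScopeSketch and its methods declare, lookup and assign are invented here, under the assumption that a scope keeps its own variables in a map and falls back to its parent for names it does not know.

```java
import java.util.HashMap;
import java.util.Map;

class ScopeSketch {
    private final ScopeSketch parent; // null for the outermost scope
    private final Map<String, Integer> variables = new HashMap<>();

    ScopeSketch(ScopeSketch parent) {
        this.parent = parent;
    }

    // "def name = value." always creates the variable in the current scope.
    void declare(String name, int value) {
        variables.put(name, value);
    }

    // Looks in this scope first, then in the enclosing scopes.
    Integer lookup(String name) {
        for (ScopeSketch scope = this; scope != null; scope = scope.parent) {
            if (scope.variables.containsKey(name)) {
                return scope.variables.get(name);
            }
        }
        return null;
    }

    // "name = value." updates the nearest scope that declared the variable.
    boolean assign(String name, int value) {
        for (ScopeSketch scope = this; scope != null; scope = scope.parent) {
            if (scope.variables.containsKey(name)) {
                scope.variables.put(name, value);
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ScopeSketch global = new ScopeSketch(null);
        global.declare("var", 13);

        ScopeSketch loopBody = new ScopeSketch(global);
        loopBody.declare("tmp", 1);
        loopBody.assign("var", 12);

        System.out.println(global.lookup("var")); // prints 12
        System.out.println(global.lookup("tmp")); // prints null
    }
}
```

Once loopBody is no longer referenced, the variable "tmp" goes away with it, while "var" lives on in the outer scope. That matches the life cycle described above: variables belong to the scope that declared them.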

Finally, a method for invoking a Java static method.

    public Object visit(JavaStaticMethods node, St4ticScope scope,
            Object... objects) throws Exception {
        String identifier = St4ticReflection.fullIdentifier(node.f0.tokenImage);
        if (identifier != null) {
            Object currentObject = St4ticReflection.makeObject(identifier);
            if (currentObject != null) {
                Enumeration e = node.f1.elements();
                while (e.hasMoreElements()) {
                    NodeSequence ns = (NodeSequence) e.nextElement();
                    if (St4ticReflection.existsField(
                            currentObject, ns.elementAt(1).toString())) {
                        currentObject = St4ticReflection.getFieldObject(
                                currentObject, ns.elementAt(1).toString());
                    }
                    else {
                        LinkedList<St4ticValue> params = new LinkedList<St4ticValue>();
                        params.add((St4ticValue) this.visit(node.f3, scope, objects));
                        Enumeration eVal = node.f4.elements();
                        while (eVal.hasMoreElements()) {
                            NodeSequence nsVal = (NodeSequence) eVal.nextElement();
                            params.add((St4ticValue) this.visit(
                                    (MathExpression) nsVal.elementAt(1),
                                    scope, objects));
                        }

                        if (St4ticReflection.existsSubroutine(currentObject,
                                ns.elementAt(1).toString(),
                                params.toArray(new St4ticValue[]{}))) {
                            return St4ticReflection.invokeStaticSubroutine(
                                    currentObject,
                                    ns.elementAt(1).toString(),
                                    params.toArray(new St4ticValue[]{}));
                        }
                        break;
                    }
                }
            }
        }
        return null;
    }

7- System:out:println( 1 + var  ).

Now is the moment of truth. Go to Eclipse or your favorite text editor, create a text file called "my-first-programming-language.st4" and type on the first line:

    require java lang .

On the second line:

    def var = 2 .

On the last line:

    System:out:println( 1 + var ) .

In the Eclipse "Run..." properties, add "my-first-programming-language.st4" to the arguments and press "Run". Or, if you use the binary (JAR file), just type in your console:

    $: java -jar st4tic.jar my-first-programming-language.st4

And you get a very nice output:

    3

Congratulations! You win and thank you for playing, this article is over.

8- Summary

I wanted to write a fun and educational article, because this topic is very large, and the classical articles on it can be discouraging and hard to follow. I hope you are now familiar with JavaCC and Java reflection.

  • JavaCC: a tool like Yacc that generates a parser as Java source code from a grammar.
  • Java reflection: a library for accessing and manipulating Java objects dynamically.
  • Keywords: you can use any language for your keywords (Arabic, Russian, etc.)

Hope you have enjoyed the article. Please give your suggestions and feedback for further improvement. Again, thanks for reading.


Source: https://www.codeproject.com/Articles/50377/Create-Your-Own-Programming-Language

Posted by: dixonaname1987.blogspot.com
