I implemented a post-increment operator in CPython. This first part gives an overview of the increment and what I learned along the way. The next part, the implementation edition, walks through the implementation in chronological order.
The series has three parts plus an extra edition:
--Overview and summary of adding post-increment to CPython
--Implementation of CPython with post-increment
--List of all changes when adding post-increment to CPython
--Extra edition of adding post-increment to CPython
First of all, please take a look at the slides, which briefly summarize the results.
It works like this.
>>> i=0
>>> i++
0
>>> i
1
>>> lst=[x for x in range(5)]
>>> lst
[0, 1, 2, 3, 4]
>>> lst[0]++
0
>>> lst
[1, 1, 2, 3, 4]
>>> class cls:
...     a=5
...
>>> cls_obj=cls()
>>> cls_obj.a
5
>>> cls_obj.a++
5
>>> cls_obj.a
6
The increment works not only on plain variables but also on list elements and member variables. In addition, the expression evaluates to the value before the increment, and the target is rewritten at the same time.
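For comparison, the same "return the old value, then store the new one" behavior can be emulated in unmodified Python with helper functions. This is only a sketch: post_incr_item and post_incr_attr are hypothetical names introduced here for illustration, not part of the modified CPython or of any library.

```python
# Hypothetical helpers emulating post-increment in unmodified Python.

def post_incr_item(container, key):
    """Emulate container[key]++: return the old value, store old + 1."""
    old = container[key]
    container[key] = old + 1
    return old


def post_incr_attr(obj, name):
    """Emulate obj.name++: return the old value, store old + 1."""
    old = getattr(obj, name)
    setattr(obj, name, old + 1)
    return old


class Cls:
    a = 5


lst = list(range(5))
first = post_incr_item(lst, 0)          # old value of lst[0]

cls_obj = Cls()
before = post_incr_attr(cls_obj, "a")   # old value of cls_obj.a
```

As in the session above, the helper returns the old value while the list element or attribute ends up one larger.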
Here's what I found by looking at the CPython 3.5.0 source code when implementing increments.
The Python script is executed as follows.
--Lexical analysis: Include/tokenizer.h, Parser/tokenizer.c
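To get a feel for the lexical analysis step, the standard tokenize module can show how unmodified Python splits i++ into tokens. This is a sketch only: tokenize is a Python-level view of tokenization, not Parser/tokenizer.c itself, and token_pairs is a helper name made up for this example.

```python
import io
import token
import tokenize


def token_pairs(source):
    """Return (token type name, text) pairs for NAME/NUMBER/OP tokens."""
    readline = io.StringIO(source).readline
    return [
        (token.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
        if tok.type in (token.NAME, token.NUMBER, token.OP)
    ]


pairs = token_pairs("i++\n")
print(pairs)
```

In stock Python there is no single ++ token, so the two plus signs come out as two separate operator tokens after the name.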
The Python Developer's Guide has a page called "23. Changing CPython's Grammar" that you can refer to. However, it does not go into much detail, so I have summarized some more specific changes here.
When you want to change a reserved word to another word, such as "I want to write foreach instead of for"
--Sometimes you just need to change Grammar/Grammar appropriately
--For 'for', you only need to change the 'for' part to ('for' | 'foreach'), but if you rewrite something deeper such as 'elif', it seems you also need to change Python/ast.c, because it will fail on an assert there (unconfirmed)
If you want to use a symbol string that is not used in Python, such as "I want to be able to use ! instead of for"
--In addition to modifying Grammar/Grammar as above, define the tokens in Include/tokenizer.h and Parser/tokenizer.c
If you want to add a grammar that uses symbols that are already used with other meanings
--For example, changing the list comprehension so that it can be written as [x + 1 | x <- range(10)] (| is already used as bitwise OR)
--Since the tokens are cut out and passed one by one to the automatically generated parser, I thought it would work to decide before that point whether a | is the bitwise-operation | or the list-comprehension |, but I gave up because building this from scratch would take too long
--I think this should really be done by the parser, but would the Python grammar, which is originally LL(1), then stop being LL(1)? → It seems necessary to read the parser generator, which I did not touch, and learn what the parser is capable of
--I didn't quite understand this (for that reason, in this experiment I cheated and implemented it using $ instead of |)
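One way to see why | is hard to reuse here: in unmodified Python the proposed notation already parses as an ordinary list containing one comparison. The check below is a sketch run against stock Python's ast module, not against the modified build.

```python
import ast

source = "[x + 1 | x <- range(10)]"
tree = ast.parse(source, mode="eval")

lst_node = tree.body           # the [...] list display
element = lst_node.elts[0]     # its single element

# Print the node types to see how stock Python reads the expression.
print(type(lst_node).__name__)
print(type(element).__name__)
print(type(element.left.op).__name__)
print(type(element.comparators[0]).__name__)
```

Stock Python reads the element as (x + 1 | x) < -range(10): the | is a bitwise OR and <- is a less-than followed by a unary minus. The text is therefore already valid syntax with a different meaning, which is the conflict described above.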
If you want to define some syntactic sugar
--In addition to tokenizer.* and Grammar, it seems good to turn it into an equivalent tree in Python/ast.c
If you want to add a grammar that goes beyond the framework of expressions and statements, or that has no existing equivalent
--In addition to the tokenizer, Grammar, and so on, compile.c sometimes also has to emit suitable bytecode. If necessary, also define new opcodes and how they are interpreted
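To see what kind of bytecode compile.c emits for an existing statement, the standard dis module can list the instructions for i += 1 in unmodified Python. This is only a sketch for orientation: the exact opcode names differ between CPython versions, so no particular listing is assumed here.

```python
import dis

# Compile a module-level augmented assignment and list its opcodes.
code = compile("i += 1", "<example>", "exec")
opnames = [ins.opname for ins in dis.get_instructions(code)]
print(opnames)
```

The listing includes a load of i, an add, and a store back to i. A new construct that cannot be expressed with such existing instructions is where a new opcode would come in.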
By the way, if you make a breaking change, the library will fail to compile at make install time.
In this increment implementation, there was at least one expression written as ++ 2, and it raised an error (why it was written that way in the first place is a mystery).
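For background, ++ 2 is already valid in unmodified Python, where it is read as unary plus applied twice. That may be why existing code written this way conflicts with a new ++ token. The sketch below checks this against stock Python only. It says nothing about which file in the library contained the expression.

```python
import ast

value = eval("++2")                          # unary plus applied twice
outer = ast.parse("++2", mode="eval").body   # outermost node of the tree

# Postfix ++ is not valid syntax in stock Python.
try:
    ast.parse("i++", mode="eval")
    postfix_ok = True
except SyntaxError:
    postfix_ok = False
```

Here value is 2, outer is a unary plus whose operand is another unary plus, and postfix_ok is False.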
compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | decorated
if_stmt: 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]
while_stmt: 'while' test ':' suite ['else' ':' suite]
for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite]