@thomthom said:
Can a Python lexer work with NP++?
I've no idea. I took a look. My lexer is a Decaf lexer, written in Python. Decaf (Java without the Jitters) is a beginner's language I've designed and am working on, though not often enough.
Decaf is quite like Python in most things, except suites. I actually have decided to implement suites:
if condition
    no braces
    around this
    block of code
# no braces, no begin/end
Fortunately for your requirement, Decaf's lexer still looks for {} around statement blocks.
Unfortunately, Ruby persuaded me to add trailing IF and UNLESS but they're not there yet, either. I could add those soon.
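For anyone curious how suites could be lexed, here is a minimal Python sketch. It is not the actual Decaf lexer. It assumes indentation is measured in leading spaces (tabs expanded to 4), that '#' starts a comment line, and it uses a made-up LINE word type as a stand-in for the real words on each line. INDENT and UNINDENT are the names the grammar's block_statement uses.

```python
def indent_words(lines, tab_width=4):
    """Yield (word_type, text) pairs, adding INDENT/UNINDENT words
    whenever the leading whitespace of a line grows or shrinks."""
    stack = [0]                       # indentation widths of the open blocks
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith('#'):
            continue                  # blank and comment lines don't open or close blocks
        expanded = line.expandtabs(tab_width)
        width = len(expanded) - len(expanded.lstrip(' '))
        if width > stack[-1]:         # deeper than the current block: open a new one
            stack.append(width)
            yield ('INDENT', '')
        while width < stack[-1]:      # shallower: close blocks until we match
            stack.pop()
            yield ('UNINDENT', '')
        if width != stack[-1]:
            raise ValueError('inconsistent indentation: %r' % line)
        yield ('LINE', stripped)      # LINE is hypothetical, not a Decaf word type
    while len(stack) > 1:             # close blocks still open at end of input
        stack.pop()
        yield ('UNINDENT', '')


if __name__ == '__main__':
    source = ['if condition', '    no braces', '    around this', 'next']
    for word in indent_words(source):
        print(word)
```

The stack of widths is the same trick Python's own tokenizer uses, so a parser downstream can treat INDENT/UNINDENT exactly as it would treat { and }.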
Here's the language description. Before you read it: in Decaf-speak, "lexical analysis" and "lexing" are described as "breaking the language into WORDS" (not tokens). Parsing is picking out PHRASES (not expressions) and SENTENCES (aka statements). The code and docs are in various stages of being converted to Decaf-speak.
# Decaf's EBNF grammar
# grammar of the grammar;
# 'x' the character 'x'
# 'xxx' the character string 'xxx' (as a separate word)
# x | y either x or y
# [ x ] 0 or 1 x
# x* 0 or more x
# x+ 1 or more x
# {} grouping, example;
# "exp{',' exp}+" - "exp" followed by one or more "',' exp"
# < xx > comment (documentation of the grammar)
# full_name, nm ;;=
# x 'full_name' is defined as 'x', 'nm' is short for 'full_name'
#
# xxx; definition continued, next line
#
# name ;;=
# x
# y 'name' is defined as 'x | y'
#
# xxx_operator, xx ;;=
# < operators of type 'xxx' listed here >
#---------------------- words -----------------------
constant_e, con_e ;;=
'E' = 2.71828182845904523536
constant_pi, con_pi ;;=
'PI' = 3.14159265358979323846
comment_eol_word, cmt_w ;;=
COMMENT_EOL
end_of_line_word, eol_w ;;=
END_OF_LINE
end_of_input_word, eoi_w ;;=
END_OF_INPUT
end_of_statement_word, eos_w ;;=
END_OF_STATEMENT
line_continuation_word, lcon_w ;;=
LINE_CONTINUATION
whitespace_word, white_w ;;=
WHITESPACE
constant_integer_word, con_i_w ;;=
CONSTANT_INTEGER
constant_decimal_word, con_d_w ;;=
CONSTANT_DECIMAL
lbrace_word, lbrace_w ;;=
LBRACE
lbracket_word, lbrkt_w ;;=
LBRACKET
lparen_word, lprn_w ;;=
LPAREN
malformed_number_word, m_num_w ;;=
MALFORMED_NUMBER
multiline_string_word, ml_str_w ;;=
MULTILINE_STRING
name_word, name_w ;;=
NAME
operator_word, op_w ;;=
OPERATOR
rbrace_word, rbrace_w ;;=
RBRACE
rbracket_word, rbrkt_w ;;=
RBRACKET
reserved_word_word, res_wrd_w ;;=
RESERVED_WORD
rparen_word, rprn_w ;;=
RPAREN
string_word, str_w ;;=
STRING
unclosed_multiline_string_word, unc_mls_w ;;=
UNCLOSED_MULTILINE_STRING
unclosed_string_word, unc_str_w ;;=
UNCLOSED_STRING
unknown_character_word, unk_chr_w ;;=
UNKNOWN_CHARACTER
#------------------- expressions --------------------
arithmetic_operator, arth_op_p ;;=
'+' | '-' | '*' | '/' | '^' | '%'
copula_operator, cop_op_p ;;=
'=>'
comparison_operator, cmp_op_p ;;=
'GT' | 'GE' | 'EQ' | 'LE' | 'LT' | 'NE' | 'IN'
binary_logical_operator, log_op_p ;;=
'AND' | 'OR'
unary_logical_operator, not_op_p ;;=
'NOT'
other_operator, oth_op_p ;;=
'.' | ',' | ';'
expression, expr ;;=
arithmetic_expression
comparison_expression
logical_expression
parenthesized_expression
list_expression
range_expression
subscript_expression
function_call
selection_expression
arithmetic_expression, arth_exp ;;=
operand arithmetic_operator operand
'-' operand
comparison_expression, cmp_exp ;;=
operand comparison_operator operand
logical_expression, log_exp ;;=
operand binary_logical_operator operand
unary_logical_operator operand
'TRUE' | 'FALSE'
comparison_expression
parenthesized_expression, paren_exp ;;=
'(' expression ')'
list_expression, lst_exp ;;=
expression {',' expression}+
range_expression, rng_exp ;;=
operand ';' [operand]
';' operand < operands must be integers >
subscript_expression, sub_exp ;;=
'[' range_expression{',' range_expression}* ']'
function_call, func_exp ;;=
parenthesized_expression function_name
function_name, func_nm ;;=
NAME < name of defined or imported function >
selection_expression, sel_exp ;;=
NAME{'.' NAME}+
operand, oprand ;;=
NAME
constant
expression
constant, cnstnt ;;=
CONSTANT_INTEGER | CONSTANT_DECIMAL | STRING | MULTILINE_STRING
#-------------------------- other phrases ----------------------------
address, addr_p ;;=
PATHNAME < address on local machine >
URL < address on any machine >
capability_name, cp_nm_p ;;=
NAME < of capability type >
right_hand_side, rhs_p ;;=
rhs_qualifiers [type_name] variable_name
rhs_qualifiers, rhs_ql_p ;;=
[ 'LOCAL' | 'GLOBAL' ] [ 'RW' | 'RO' ] [range_expression]
return_types_list, rt_tps_p ;;=
type_name{',' type_name}*
type_name, tp_nm_p ;;=
base_type
built_in_type
defined_type
base_type, bs_tp_p ;;=
'BIT' | 'BYTE' | 'CHAR' | 'DEC' | 'GROUP' |;
'INT' | 'NAMELIST' | 'TYPE' | 'SUB'
built_in_type, bi_tp_p ;;=
< see http://www.MartinRinehart.com/posters/decaf-object-library.html >
defined_type, def_tp_p ;;=
< types defined in page or imported from other pages >
variable_name, vr_nm_p ;;=
NAME < word, type==NAME, textValue contains name >
condition, cond_p ;;=
logical_expression
list_name, ls_nm_p ;;=
NAME < that identifies a list, array, NAMELIST or group >
self_name_p, self_p ;;=
'ME'
#--------------------- statements -----------------------
statement, smt ;;=
declaration_statement
action_statement
declaration_statement, dec_smt ;;=
var_declaration_statement
include_statement
type_declaration_statement
sub_declaration_statement
action_statement, act_smt ;;=
expression_statement
block_statement
if_statement
foreach_statement
while_statement
loop_statement
break_statement
return_statement
var_declaration_statement, vr_dec_smt ;;=
rhs_qualifiers type_name variable_name
include_statement, incl_smt ;;=
'INCLUDE' address {',' address}*
type_declaration_statement, tp_dec_smt ;;=
{ 'CLASS' | 'CAPABILITY' } NAME1;
[ 'EXTENDS' NAME2 ];
['CANDO' capability_name {',' capability_name}* ];
[ MULTILINE_STRING ] < documentation >;
action_statement
sub_declaration_statement, sb_dec_smt ;;=
'SUB' '(' [ list_expression ] ')' NAME '(' [ return_types_list ] ')';
action_statement
expression_statement, exp_smt ;;=
expression [ copula_operator right_hand_side{',' right_hand_side}* ]
block_statement, blk_smt ;;=
INDENT;
[ MULTILINE_STRING ] < documentation >;
statement*;
UNINDENT
if_statement, if_smt ;;=
'IF' condition 'THEN' action_statement [ 'ELSE' action_statement ]
foreach_statement, for_smt ;;=
'FOR' 'EACH' variable_name 'IN' list_name 'DO' action_statement
while_statement, whl_smt ;;=
'WHILE' condition 'DO' action_statement
loop_statement, loop_smt ;;=
'LOOP'
break_statement, brk_smt ;;=
'BREAK'
return_statement, ret_smt ;;=
'RETURN' expression < expression may be list_expression >
#---------------------- program -----------------------
program, prgrm ;;=
[ MULTILINE_STRING ] < documentation >;
statement*;
END_OF_INPUT
# end of decaf.bnf
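To make "breaking the language into WORDS" concrete, here is a rough Python sketch of a regex-based word splitter. It is not the real Decaf lexer: the word type names come from the words section of the grammar, but every pattern, the RESERVED set (a partial list taken from the statement rules) and the function name are assumptions for illustration.

```python
import re

# Word types are from the grammar; the patterns are guesses, not Decaf's own.
WORD_PATTERNS = [
    ('COMMENT_EOL',       r'#[^\n]*'),
    ('END_OF_LINE',       r'\n'),
    ('WHITESPACE',        r'[ \t]+'),
    ('CONSTANT_DECIMAL',  r'\d+\.\d+'),      # must be tried before CONSTANT_INTEGER
    ('CONSTANT_INTEGER',  r'\d+'),
    ('STRING',            r'"[^"\n]*"'),
    ('NAME',              r'[A-Za-z_][A-Za-z0-9_]*'),
    ('LBRACE',            r'\{'),
    ('RBRACE',            r'\}'),
    ('LBRACKET',          r'\['),
    ('RBRACKET',          r'\]'),
    ('LPAREN',            r'\('),
    ('RPAREN',            r'\)'),
    ('OPERATOR',          r'=>|[-+*/^%.,;]'),
    ('UNKNOWN_CHARACTER', r'.'),             # catch-all, tried last
]
MASTER = re.compile('|'.join('(?P<%s>%s)' % pair for pair in WORD_PATTERNS))

# Partial, assumed list: only the keywords of the statement rules.
RESERVED = {'IF', 'THEN', 'ELSE', 'FOR', 'EACH', 'DO',
            'WHILE', 'LOOP', 'BREAK', 'RETURN'}


def split_words(text):
    """Yield (word_type, text) pairs for every word in `text`."""
    for match in MASTER.finditer(text):
        kind, value = match.lastgroup, match.group()
        if kind == 'WHITESPACE':
            continue                          # whitespace only separates words
        if kind == 'NAME' and value in RESERVED:
            kind = 'RESERVED_WORD'
        yield kind, value
    yield 'END_OF_INPUT', ''


if __name__ == '__main__':
    for word in split_words('IF ok THEN { x + 1 => y }\n'):
        print(word)
```

Because every character matches some pattern (UNKNOWN_CHARACTER at worst), the splitter never silently drops input. An editor lexer needs that, since it has to colour half-typed code too.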
This is drifting off-topic. Unless someone (anyone?) else is interested, maybe we could continue this via email: MartinRinehart at gmail dot com.