reworked context stack system
From: Dave Mitchell
Date: January 4, 2016 16:50
Subject: reworked context stack system
Message ID: 20160104165011.GB4007@iabyn.com
The just-pushed branch smoke-me/davem/contextsB4 contains about 200 commits
that heavily rework the scope / context stack system (all that PUSHBLOCK,
POPSUB etc stuff). If it smokes ok, and if there aren't any objections,
I intend to merge it in a few days' time.
The changes are all to non-API internals, although a few distributions do
rely on them. grep.cpan.me shows the following XS-containing distributions
as likely to break:
2008 EDPRATOMO/Algorithm-Permute-0.12
2013 HUGMEIR/Params-Lazy-0.005
2015 ZEFRAM/Data-Alias-1.20
2015 ANDREASG/Devel-Unwind-0.01
2015 RSCHUPP/PAR-Packer-1.028
2015 VPIT/Scope-Upper-0.28
Benefits:
Entering, iterating and leaving scopes is now considerably faster; see the
benchmarks at the end of this email for details. Any code that repeatedly
calls a sub or method, or runs a 'for' loop, with a small body is likely to
notice the difference. For example, this runs about 40% faster (x86_64 system):
    my @a = 1..1000;
    for (1..30_000) {
        $_++ for @a;
    }
and this is 23% faster:
    sub inc { $_[0]++ }
    my @a = 1..1000;
    for (1..30_000) {
        inc($_) for @a;
    }
Conversely, code dominated by big subs and loop bodies probably won't notice.
This branch fixes various bugs related to dying on scope exit: things had
a habit of leaking or being double-freed; it should also make fixing future
bugs easier, as scope exit now undoes/restores things in the logical reverse
order of scope entry.
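
To illustrate why restoring in reverse order of entry matters, here is a
minimal toy sketch in C. It is not perl's savestack; the names toy_save_int
and toy_leave_scope are hypothetical and exist only for this illustration.
When the same variable is saved twice, only newest-first restoration brings
back the original value.

```c
#include <assert.h>

/* Toy model only: a tiny "savestack" that records old values of ints.
 * toy_save_int() and toy_leave_scope() are hypothetical names, not perl API. */
#define TOY_SAVE_MAX 16

struct toy_save { int *addr; int old; };

static struct toy_save toy_savestack[TOY_SAVE_MAX];
static int toy_save_ix = 0;   /* number of entries currently saved */

/* Record the current value of *p so it can be restored later. */
static void toy_save_int(int *p)
{
    assert(toy_save_ix < TOY_SAVE_MAX);
    toy_savestack[toy_save_ix].addr = p;
    toy_savestack[toy_save_ix].old  = *p;
    toy_save_ix++;
}

/* Undo everything saved since 'base', newest entry first. */
static void toy_leave_scope(int base)
{
    assert(base >= 0 && base <= toy_save_ix);
    while (toy_save_ix > base) {
        toy_save_ix--;
        *toy_savestack[toy_save_ix].addr = toy_savestack[toy_save_ix].old;
    }
}
```

If x starts at 1 and is saved before being set to 2, then saved again before
being set to 3, toy_leave_scope() first puts back 2 and then 1. Restoring
oldest-first would instead leave x at 2.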
The new system for pushing, popping and manipulating contexts is now
documented in a new section in perlguts, L<perlguts/Dynamic Scope and the
Context Stack>. Formerly the whole context stack system was almost
completely undocumented.
Significant changes:
PUSHBLOCK(), POPSUB() etc have been replaced with static inline functions
such as cx_pushblock(), cx_popsub() etc. These use function args rather than
implicitly relying on local vars such as gimme and newsp being available.
Also their functionality has changed: in particular, cx_popblock() no
longer decrements cx_stack_ix. The ordering of the steps in the pp_leave*
functions involving cx_popblock(), cx_popsub() etc has changed. See the
documentation in perlguts for details on how to use them.
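
As a rough illustration of the macro-versus-function point, here is a toy
sketch in C. Every name in it (toy_cx, TOY_PUSHBLOCK, toy_cx_pushblock,
toy_cx_popblock, toy_cx_pop) is a hypothetical stand-in; the real signatures
and the real order of steps are the ones documented in perlguts. It assumes
a model in which popping the block only restores state and a separate step
lowers the stack index.

```c
#include <assert.h>

/* Toy context frame; the fields are illustrative, not perl's. */
struct toy_cx { int type; int gimme; int saved_sp; };

#define TOY_CX_MAX 8
static struct toy_cx toy_cxstack[TOY_CX_MAX];
static int toy_cxstack_ix = -1;   /* index of the current frame */

/* Old style (sketch): a macro that silently relies on locals named
 * 'gimme' and 'newsp' existing in the caller's scope. */
#define TOY_PUSHBLOCK(cx, t)                                   \
    ((cx) = &toy_cxstack[++toy_cxstack_ix],                    \
     (cx)->type = (t), (cx)->gimme = gimme, (cx)->saved_sp = newsp)

/* New style (sketch): everything needed is an explicit argument. */
static inline struct toy_cx *
toy_cx_pushblock(int type, int gimme, int sp)
{
    struct toy_cx *cx;
    assert(toy_cxstack_ix + 1 < TOY_CX_MAX);
    cx = &toy_cxstack[++toy_cxstack_ix];
    cx->type     = type;
    cx->gimme    = gimme;
    cx->saved_sp = sp;
    return cx;
}

/* Hands back the saved value only; it does not touch toy_cxstack_ix. */
static inline int toy_cx_popblock(const struct toy_cx *cx)
{
    return cx->saved_sp;
}

/* Separate step that actually removes the frame from the stack. */
static inline void toy_cx_pop(const struct toy_cx *cx)
{
    assert(cx == &toy_cxstack[toy_cxstack_ix]);
    toy_cxstack_ix--;
}
```

The explicit-argument form lets the compiler check each call site, whereas
the macro form fails to compile, or quietly picks up the wrong variable,
whenever the caller's locals are named differently.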
Various macros, which now consistently have a CX_ prefix, have been added:
    CX_CUR(), CX_LEAVE_SCOPE(), CX_POP()
or renamed:
    CX_POP_SAVEARRAY(), CX_DEBUG(), CX_PUSHSUBST(), CX_POPSUBST()
cx_pushblock() now saves PL_savestack_ix and PL_tmps_floor, so pp_enter*
and pp_leave* no longer do
    ENTER; SAVETMPS; ....; LEAVE
cx_popblock() now also restores PL_curpm.
In dounwind(), the current savestack frame is processed before each context
is popped for every context type; formerly this was only done for sub-like
context frames. This action has been removed from POPSUB/cx_popsub and
placed into its own macro, CX_LEAVE_SCOPE(cx), which must be called before
cx_popsub etc.
dounwind() now also does a cx_popblock() on the last popped frame (formerly
it only did the cx_popsub() etc actions on each frame).
The temps stack is now freed on scope exit; previously, temps created
during the last statement of a block wouldn't be freed until the next
nextstate following the block (apart from an existing hack that did this
for recursive subs in scalar context); and in something like f(g()),
the temps created by the last statement in g() would formerly not be
freed until the statement following the return from f().
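
The idea of freeing temps at scope exit can be sketched with a toy model in
C. This is only an illustration under simplifying assumptions: temps are
slots in an array, and each scope remembers a "floor" below which temps
belong to outer scopes. All names (toy_new_temp, toy_enter_tmps_scope,
toy_leave_tmps_scope) are hypothetical and are not perl functions.

```c
#include <assert.h>

#define TOY_TMPS_MAX 16
static int toy_tmps_live[TOY_TMPS_MAX]; /* 1 = temp is still allocated */
static int toy_tmps_ix    = -1;         /* index of the newest temp */
static int toy_tmps_floor = -1;         /* temps at or below this index
                                           belong to outer scopes */

/* Create a temp in the current scope and return its slot. */
static int toy_new_temp(void)
{
    assert(toy_tmps_ix + 1 < TOY_TMPS_MAX);
    toy_tmps_live[++toy_tmps_ix] = 1;
    return toy_tmps_ix;
}

/* Free every temp created above the current floor. */
static void toy_free_tmps(void)
{
    while (toy_tmps_ix > toy_tmps_floor)
        toy_tmps_live[toy_tmps_ix--] = 0;
}

/* Scope entry: raise the floor, returning the old one for the caller to keep. */
static int toy_enter_tmps_scope(void)
{
    int old_floor = toy_tmps_floor;
    toy_tmps_floor = toy_tmps_ix;
    return old_floor;
}

/* Scope exit: free this scope's temps right away, then restore the floor. */
static void toy_leave_tmps_scope(int old_floor)
{
    toy_free_tmps();
    toy_tmps_floor = old_floor;
}
```

In this model a temp created inside the scope is gone as soon as
toy_leave_tmps_scope() returns, rather than lingering until some later
statement in the caller triggers a free.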
The mortal copying of args returned when leaving scopes is now
special-cased for undef/IV/RV for efficiency's sake.
Most values that were saved on the savestack on scope entry are now
saved in suitable new fields in the context struct, and saved and
restored directly by cx_pushfoo() and cx_popfoo(), which is much faster.
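
A minimal sketch of the "save in the context struct" approach, as a toy
model in C. The globals and field names here are hypothetical stand-ins
(loosely modelled on values such as PL_tmps_floor and PL_curpm mentioned
above), not the real struct layout.

```c
#include <assert.h>

/* Toy interpreter state; stand-ins, not the real perl globals. */
static int toy_tmps_floor = -1;
static int toy_curpm      = 0;

/* Toy context frame with one dedicated field per value saved on entry. */
struct toy_cx {
    int old_tmps_floor;
    int old_curpm;
};

/* Scope entry: copy the current values straight into the frame. */
static inline void toy_cx_save(struct toy_cx *cx)
{
    assert(cx != 0);
    cx->old_tmps_floor = toy_tmps_floor;
    cx->old_curpm      = toy_curpm;
}

/* Scope exit: plain assignments put the old values back. */
static inline void toy_cx_restore(const struct toy_cx *cx)
{
    assert(cx != 0);
    toy_tmps_floor = cx->old_tmps_floor;
    toy_curpm      = cx->old_curpm;
}
```

Compared with pushing a generic (type, address, value) record and later
dispatching on its type, each save and restore here is a single fixed-offset
load or store, which is where the speed-up in this sketch would come from.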
Various context struct fields have been added, removed or modified.
The handling of @_ in cx_pushsub() and cx_popsub() has been considerably
tidied up, including removing the argarray field from the context struct,
and extracting out some common (but rarely used) code into a separate
function, clear_defarray(). Also, useful subsets of cx_popsub() which had
been unrolled in places like pp_goto have been gathered into the functions
cx_popsub_args() and cx_popsub_common().
pp_leavesub and pp_leavesublv now use the same function as the rest of the
pp_leave*'s to process return args.
The CXp_FOR_PAD and CXp_FOR_GV flags have been added, and CXt_LOOP_FOR has
been split into CXt_LOOP_LIST and CXt_LOOP_ARY.
Some variables formerly declared in dMULTICALL (but not documented) have
been removed.
There is some more work I would like to do at some point, but not until
sometime post-5.24. In particular, I want to look into whether it's
worthwhile giving grep, map and XS subs their own context types; and
to add a PL_cur_sub_cxix var, which points to the current sub/eval/format
context, so that (among other things), determining scalar/list context
for the last statement in a sub doesn't need to involve scanning the context
stack for the last sub call.
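
The trade-off behind a cached sub-context index can be sketched as a toy
model in C. Nothing here is existing or planned perl code: the names and the
per-frame toy_prev_sub array used to restore the cached index on pop are
assumptions made purely for illustration.

```c
#include <assert.h>

enum toy_cxtype { TOY_CX_BLOCK, TOY_CX_LOOP, TOY_CX_SUB };

#define TOY_MAX 16
static enum toy_cxtype toy_types[TOY_MAX];
static int toy_prev_sub[TOY_MAX]; /* cached index in effect before each push */
static int toy_ix         = -1;   /* top of the toy context stack */
static int toy_cur_sub_ix = -1;   /* cached index of the innermost sub frame */

/* Push a frame, updating the cached index if it is a sub frame. */
static void toy_push(enum toy_cxtype t)
{
    assert(toy_ix + 1 < TOY_MAX);
    toy_ix++;
    toy_types[toy_ix]    = t;
    toy_prev_sub[toy_ix] = toy_cur_sub_ix;
    if (t == TOY_CX_SUB)
        toy_cur_sub_ix = toy_ix;
}

/* Pop a frame, restoring the cached index that was valid before it. */
static void toy_pop(void)
{
    assert(toy_ix >= 0);
    toy_cur_sub_ix = toy_prev_sub[toy_ix];
    toy_ix--;
}

/* The scanning approach: walk down from the top until a sub frame is found. */
static int toy_find_sub_by_scan(void)
{
    int i;
    for (i = toy_ix; i >= 0; i--)
        if (toy_types[i] == TOY_CX_SUB)
            return i;
    return -1;
}
```

Both approaches give the same answer; the cached index simply trades a
little bookkeeping on every push and pop for a constant-time lookup.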
Benchmarks:
These are the results of Porting/bench.pl on perl pre- and post- this branch,
running just the call:: and loop:: benchmarks, sorted by increasing Ir.
100 = unchanged; 200 (good) = takes half as many instructions etc as before.
Almost every value is > 100, with a simple empty block, {1;}, using less
than half the number of instructions.
Ir     Dr     Dw     COND   IND    COND_m IND_m
------ ------ ------ ------ ------ ------ ------
101.95 102.65 101.79 102.52 113.24 140.12 117.39 call::sub::args3_ret3str
102.78 102.55 100.80 103.75 120.69 102.94 100.00 loop::grep::expr_3int
103.79 103.56 101.02 106.00 134.29 140.40 100.00 loop::map::expr_3int
105.32 103.19 106.13 104.14 114.52  76.92 125.00 loop::grep::block_3int
106.21 107.11 103.86 107.08 162.50 152.04 121.05 call::sub::args3_ret3temp
108.34 105.77 107.89 108.07 134.09 140.40 100.00 loop::map::block_3int
112.55 113.63 108.69 114.90 187.50 150.51 121.05 call::sub::args3_ret3
112.96 113.11 109.37 115.24 145.95 290.24 133.33 call::sub::recursive
113.18 110.96 117.80 106.15 103.12 100.00 100.00 loop::while::i4
113.40 111.49 109.57 116.14 147.06 372.73 146.15 call::sub::args2_ret1temp
115.66 115.84 111.31 118.50 164.71 152.63 146.15 call::sub::args2_ret1
118.59 118.77 115.24 118.73 164.29 147.62 140.00 call::sub::args3
124.16 120.00 134.29 112.12 106.25 137.93 100.00 loop::while::i1
127.19 125.18 135.00 126.26 109.09 102.56 109.09 loop::for::lex_array4
128.70 130.71 132.93 128.25 108.00 103.45 109.09 loop::for::lex_list4
128.72 124.83 140.23 128.72 108.33 222.22 108.33 loop::for::pkg_array4
129.41 129.91 138.12 128.50 107.41 200.00 108.33 loop::for::pkg_list4
129.98 129.37 138.59 117.28 118.52 137.93 107.69 loop::for::lex_range4
130.14 127.21 138.89 118.36 124.14 133.33 114.29 loop::for::pkg_range4
130.16 122.29 125.23 130.65 183.33      - 133.33 loop::do
130.49 126.07 142.94 130.11 108.70 137.93 108.70 loop::for::defsv_array4
131.22 131.45 140.59 129.89 107.69 133.33 108.70 loop::for::defsv_list4
132.37 128.80 141.18 119.80 125.00 100.00 114.81 loop::for::defsv_range4
132.81 129.51 141.25 129.61 113.64  50.85 113.64 loop::for::my_array4
134.29 135.19 138.92 131.64 112.00 133.33 113.64 loop::for::my_list4
135.08 133.49 144.02 120.42 122.22 100.00 111.54 loop::for::my_range4
136.89 131.41 146.97 124.79 123.53 200.00 125.00 loop::for::pkg_range1
137.70 136.59 147.46 123.17 113.33 125.64 114.29 loop::for::lex_range1
141.29 134.51 151.24 127.94 125.00 102.50 126.67 loop::for::defsv_range1
144.56 137.65 157.29 139.49 111.54  50.85 111.54 loop::for::next4
146.18 143.72 155.93 128.85 120.00 100.00 121.43 loop::for::my_range1
148.78 138.70 165.69 145.56 116.67 157.89 116.67 loop::for::pkg_array1
149.61 142.71 160.23 141.56 120.00 102.56 120.00 loop::for::lex_array1
151.20 145.45 168.04 147.06 116.67 150.00 116.67 loop::for::pkg_list1
154.24 151.12 162.65 148.61 120.00 103.45 120.00 loop::for::lex_list1
158.20 150.53 176.74 152.63 118.18 150.00 118.18 loop::for::defsv_array1
158.20 150.53 176.74 152.63 118.18 150.00 118.18 loop::for::defsv_list1
160.73 145.77 160.87 152.17 150.00 100.00 160.00 call::sub::empty
161.20 151.50 171.59 149.35 130.00  50.85 130.00 loop::for::my_array1
166.60 160.89 174.70 156.94 130.00 133.33 130.00 loop::for::my_list1
172.79 151.30 172.41 161.11 160.00 100.00 160.00 call::sub::amp_empty
209.56 173.33 212.12 200.00 133.33 100.00 100.00 loop::block
130.93 127.54 135.45 126.60 124.55 115.52 117.37 AVERAGE
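
For reading the table, the stated convention (100 = unchanged, 200 = half as
many) is consistent with a score of 100 * before / after. The helper below
is a hypothetical sketch of that arithmetic in C; it is an assumption about
how the figures are normalised, not code taken from Porting/bench.pl.

```c
#include <assert.h>

/* Relative score under the assumed convention: 100 means unchanged,
 * 200 means the new perl needs half as many events as the old one. */
static double toy_score(double before, double after)
{
    assert(after > 0.0);
    return 100.0 * before / after;
}
```

Under that assumption, the Ir score of 209.56 for loop::block corresponds to
the new code executing 100 / 209.56, or roughly 48%, of the instructions the
old code did.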
--
Little fly, thy summer's play my thoughtless hand
has terminated with extreme prejudice.
(with apologies to William Blake)