summaryrefslogtreecommitdiff
blob: 7008f5ce3d99e0bdcc70f9015a0d373258d61c0e (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
---
GLEP: 33
Title: Eclass Restructure/Redesign
Author: Brian Harring <ferringb@gentoo.org>,
        John Mylchreest <johnm@gentoo.org>
Type: Standards Track
Status: Deferred
Version: 1
Created: 2005-01-29
Last-Modified: 2014-01-17
Post-History: 2005-01-29, 2005-03-06, 2005-09-15, 2006-09-05
Content-Type: text/x-rst
---

Status
======

Approved by the Gentoo Council on 2005-09-15.  As of September 2006
this GLEP is on hold, pending future revisions.

Abstract
========

For any design, the transition from theoretical to applied exposes inadequacies 
in the original design.  This document is intended to document, and propose a 
revision of the current eclass setup to address current eclass inadequacies.

This document proposes several things- the creation of ebuild libraries, 'elibs', 
a narrowing of the focus of eclasses, a move of eclasses w/in the tree, the 
addition of changelogs, and a way to allow for simple eclass gpg signing.
In general, a large scale restructuring of what eclasses are and how they're
implemented.  Essentially version two of the eclass setup.


Terminology
===========

From this point on, the proposed eclass setup will be called 'new eclasses', the
existing crop (as of this writing) will be referenced as 'old eclasses'.  The
distinction is elaborated on within this document.


Motivation and Rationale
========================

Eclasses within the tree currently are a bit of a mess- they're forced to
maintain backwards compatibility w/ all previous functionality.  In effect,
their api is constant, and can only be added to- never changing the existing
functionality.  This obviously is quite limiting, and leads to cruft accruing in
eclasses as a eclasses design is refined.  This needs to be dealt with prior to
eclass code  reaching a critical mass where they become unmanageable/fragile
(recent pushes for eclass versioning could be interpreted as proof of this).

Beyond that, eclasses were originally intended as a method to allow for ebuilds
to use a pre-existing block of code, rather then having to duplicate the code in
each ebuild.  This is a good thing, but there are ill effects that result from
the current design.  Eclasses inherit other eclasses to get a single function- in
doing so, modifying the the exported 'template' (default src_compile, default
src_unpack, various vars, etc).  All the eclass designer was after was reusing a
function, not making their eclass sensitive to changes in the template of the
eclass it's inheriting.  The eclass designer -should- be aware of changes in the
function they're using, but shouldn't have to worry about their default src_*
and pkg_* functions being overwritten, let alone the env changes.

Addressing up front why a collection of eclass refinements are being rolled into
a single set of changes, parts of this proposal -could- be split into multiple
phases.  Why do it though?  It's simpler for developers to know that the first
eclass specification was this, and that the second specification is that, 
rather then requiring them to be aware of what phase of eclass changes is in 
progress.

By rolling all changes into one large change, a line is intentionally drawn in
the sand.  Old eclasses allowed for this, behaved this way.  New eclasses allow
for that, and behave this way.  This should reduce misconceptions about what is
allowed/possible with eclasses, thus reducing bugs that result from said
misconceptions.

A few words on elibs- think of them as a clear definition between behavioral 
functionality of an eclass, and the library functionality.  Eclass's modify 
template data, and are the basis for other ebuilds- elibs, however are *just* 
common bash functionality.

Consider the majority of the portage bin/* scripts- these all are candidates for 
being added to the tree as elibs, as is the bulk of eutils.


Specification
=============

The various parts of this proposal are broken down into a set of changes and
elaborations on why a proposed change is preferable.  It's advisable to the
reader that this be read serially, rather then jumping around.


Ebuild Libraries (elibs for short)
----------------------------------

As briefly touched upon in Motivation and Rationale, the original eclass design
allowed for the eclass to modify the metadata of an ebuild, metadata being the
DEPENDS, RDEPENDS, SRC_URI, IUSE, etc, vars that are required to be constant,
and used by portage for dep resolution, fetching, etc.  Using the earlier
example, if you're after a single function from an eclass (say epatch from
eutils), you -don't- want the metadata modifications the eclass you're
inheriting might do.  You want to treat the eclass you're pulling from as a
library, pure and simple.

A new directory named elib should be added to the top level of the tree to serve
as a repository of ebuild function libraries.  Rather then relying on using the
source command, an 'elib' function should be added to portage to import that
libraries functionality.  The reason for the indirection via the function is 
mostly related to portage internals, but it does serve as an abstraction such 
that (for example) zsh compatibility hacks could be hidden in the elib function.

Elib's will be collections of bash functions- they're not allowed to do anything
in the global scope aside from function definition, and any -minimal-
initialization of the library that is absolutely needed.  Additionally, they 
cannot modify any ebuild template functions- src_compile, src_unpack.  Since they are
required to not modify the metadata keys, nor in any way affect the ebuild aside
from providing functionality, they can be conditionally pulled in.  They also
are allowed to pull in other elibs, but strictly just elibs- no eclasses, just
other elibs.  A real world example would be the eutils eclass.

Portage, since the elib's don't modify metadata, isn't required to track elibs
as it tracks eclasses.  Thus a change in an elib doesn't result in half the tree
forced to be regenerated/marked stale when changed (this is more of an infra
benefit, although regen's that take too long due to eclass changes have been
known to cause rsync issues due to missing timestamps).  

Elibs will not be available in the global scope of an eclass, or ebuild- nor during the 
depends phase (basically a phase that sources the ebuild, to get its metadata).  Elib 
calls in the global scope will be tracked, but the elib will not be loaded till just before
the setup phase (pkg_setup).  There are two reasons for this- first, it ensures elibs are 
completely incapable of modifying metadata.  There is no room for confusion, late loading
of elibs gives you the functionality for all phases, except for depends- depends being the 
only phase that is capable of specifying metadata.  Second, as an added bonus, late 
loading reduces the amount of bash sourced for a regen- faster regens.  This however is minor,
and is an ancillary benefit of the first reason.

There are a few further restrictions with elibs--mainly, elibs to load can only be specified
in either global scope, or in the setup, unpack, compile, test, and install phases.  You can 
not load elibs in prerm, postrm, preinst, and postinst.  The reason being, for \*rm phases, 
installed pkgs will have to look to the tree for the elib, which allows for api drift to cause 
breakage.  For \*inst phases, same thing, except the culprit is binpkgs.

There is a final restriction--elibs cannot change their exported api dependent on the api 
(as some eclass do for example).  The reason mainly being that elibs are loaded once--not 
multiple times, as eclasses are.

To clarify, for example this is invalid.
::

	if [[ -n ${SOME_VAR} ]]; then
		func x() { echo "I'm accessible only via tweaking some var";}
	else
		func x() { echo "this is invalid, do not do it."; }
	fi


Regarding maintainability of elibs, it should be a less of a load then old
eclasses.  One of the major issues with old eclasses is that their functions are
quite incestuous- they're bound tightly to the env they're defined in.  This
makes eclass functions a bit fragile- the restrictions on what can, and cannot
be done in elibs will address this, making functionality less fragile (thus a
bit more maintainable).

There is no need for backwards compatibility with elibs- they just must work
against the current tree.  Thus elibs can be removed when the tree no longer
needs them.  The reasons for this are explained below.

Structuring of the elibs directory will be exactly the same as that of the new
eclass directory (detailed below), sans a different extension.

As to why there are so many restrictions, the answer is simple- the definition of
what elibs are, what they are capable of, and how to use them is nailed down as much as 
possible to avoid *any* ambiguity related to them.  The intention is to make it clear,
such that no misconceptions occur, resulting in bugs.

The reduced role of Eclasses, and a clarification of existing Eclass requirements
---------------------------------------------------------------------------------

Since elibs are now intended on holding common bash functionality, the focus of
eclasses should be in defining an appropriate template for ebuilds.  For example,
defining common DEPENDS, RDEPENDS, src_compile functions, src_unpack, etc.
Additionally, eclasses should pull in any elibs they need for functionality.

Eclass functionality that isn't directly related to the metadata, or src_* and
pkg_* funcs should be shifted into elibs to allow for maximal code reuse.  This
however isn't a hard requirement, merely a strongly worded suggestion.

Previously, it was 'strongly' suggested by developers to avoid having any code
executed in the global scope that wasn't required.  This suggestion is now a
requirement.  Execute only what must be executed in the global scope.  Any code
executed in the global scope that is related to configuring/building the package
must be placed in pkg_setup.  Metadata keys (already a rule, but now stated as
an absolute requirement to clarify it) *must* be constant.  The results of
metadata keys exported from an ebuild on system A, must be *exactly* the same as
the keys exported on system B.

If an eclass (or ebuild for that matter) violates this constant requirement, it
leads to portage doing the wrong thing for rsync users- for example, wrong deps
pulled in, leading to compilation failure, or dud deps.

If the existing metadata isn't flexible enough for what is required for a
package, the parsing of the metadata is changed to address that.  Cases where
the constant requirement is violated are known, and a select few are allowed-
these are exceptions to the rule that are required due to inadequacies in
portage.  Any case where it's determined the constant requirement may need to be 
violated the dev must make it aware to the majority of devs, along with the portage 
devs.  This should be done prior to committing.

It's quite likely there is a way to allow what you're attempting- if you just go
and do it, the rsync users (our user base) suffer the results of compilation
failures and unneeded deps being pulled in.

After that stern reminder, back to new eclasses.  Defining INHERITED and ECLASS
within the eclass is no longer required.  Portage already handles those vars if
they aren't defined.

As with elibs, it's no longer required that backwards compatibility be maintained
indefinitely- compatibility must be maintained against the current tree, but
just that.  As such new eclasses (the true distinction of new vs old is
elaborated in the next section) can be removed from the tree once they're no
longer in use.


The end of backwards compatibility...
-------------------------------------

With current eclasses, once the eclass is in use, its api can no longer be
changed, nor can the eclass ever be removed from the tree.  This is why we still
have *ancient* eclasses that are completely unused sitting in the tree, for
example inherit.eclass.  The reason for this, not surprisingly, is a portage 
deficiency: on unmerging an installed ebuild, portage used the eclass from the
current tree.

For a real world example of this, if you merged a glibc 2 years back, whatever
eclasses it used must still be compatible, or you may not be able to unmerge the
older glibc version during an upgrade to a newer version.  So either the glibc
maintainer is left with the option of leaving people using ancient versions out
in the rain, or maintaining an ever increasing load of backwards compatibility
cruft in any used eclasses.

Binpkgs suffer a similar fate.  Merging of a binpkg pulls needed eclasses from
the tree, so you may not be able to even merge a binpkg if the eclasses api has
changed.  If the eclass was removed, you can't even merge the binpkg, period.

The next major release of portage will address this- the environment that the
ebuild was built in already contains the eclasses functions, as such the env can
be re-used rather then relying on the eclass.  In other words, binpkgs and
installed ebuilds will no longer go and pull needed eclasses from the tree,
they'll use the 'saved' version of the eclass they were built/merged with.

So the backwards compatibility requirement for users of the next major portage
version (and beyond) isn't required.  All the cruft can be dropped.

The problem is that there will be users using older versions of portage that don't 
support this functionality- these older installations *cannot* use the 
new eclasses, due to the fact that their portage version is incapable of 
properly relying on the env- in other words, the varying api of the eclass will
result in user-visible failures during unmerging.

So we're able to do a clean break of all old eclasses, and api cruft, but we need 
a means to basically disallow access to the new eclasses for all portage versions 
incapable of properly handling the env requirements.

Unfortunately, we cannot just rely on a different grouping/naming convention within 
the old eclass directory.  The new eclasses must be inaccessible, and portage throws
a snag into this- the existing inherit function that is used to handle existing
eclasses.  Basically, whatever it's passed (inherit kernel or inherit
kernel/kernel) it will pull in (kernel.eclass, and kernel/kernel.eclass
respectively).  So even if the new eclasses were implemented within a
subdirectory of the eclass dir in the tree, all current portage versions would
still be able to access them.

In other words, these new eclasses would in effect, be old eclasses since older
portage versions could still access them.


Tree restructuring
------------------

There are only two way to block the existing (as of this writing) inherit
functionality from accessing the new eclasses- either change the extension of
eclasses to something other then 'eclass', or to have them stored in a separate
subdirectory of the tree then eclass.

The latter is preferable, and the proposed solution.  Reasons are- the current
eclass directory is already overgrown.  Structuring of the new eclass dir
(clarified below) will allow for easier signing, ChangeLogs, and grouping of
eclasses.  New eclasses allow for something akin to a clean break and have new
capabilities/requirements, thus it's advisable to start with a clean directory, 
devoid of all cruft from the old eclass implementation.

If it's unclear as to why the old inherit function *cannot* access the new
eclasses, please reread the previous section.  It's unfortunately a requirement
to take advantage of all that the next major portage release will allow.

The proposed directory structure is ${PORTDIR}/include/{eclass,elib}.
Something like ${PORTDIR}/new-eclass, or ${PORTDIR}/eclass-ng could be used
(although many would cringe at the -ng), but such a name is unwise.  Consider the
possibility (likely a fact) that new eclasses someday may be found lacking, and
refined further (version three as it were).  Or perhaps we want to add yet more
functionality with direct relation to sourcing new files, and we would then need
to further populate ${PORTDIR}.

The new-eclass directory will be (at least) 2 levels deep- for example:

::
	kernel/
	kernel/linux-info.eclass
	kernel/linux-mod.eclass
	kernel/kernel-2.6.eclass
	kernel/kernel-2.4.eclass
	kernel/ChangeLog
	kernel/Manifest

No eclasses will be allowed in the base directory- grouping of new eclasses will
be required to help keep things tidy, and for the following reasons.  Grouping
of eclasses allows for the addition of ChangeLogs that are specific to that
group of eclasses, grouping of files/patches as needed, and allows for
saner/easier signing of eclasses- you can just stick a signed
Manifest file w/in that grouping, thus providing the information portage needs
to ensure no files are missing, and that nothing has been tainted.

The elib directory will be structured in the same way, for the same reasons.

Repoman will have to be extended to work within new eclass and elib groups, and
to handle signing and committing.  This is intentional, and a good thing.  This
gives repoman the possibility of doing sanity checks on elibs/new eclasses.

Note these checks will not prevent developers from doing dumb things with eclass- 
these checks would only be capable of doing basic sanity checks, such as syntax checks.
There is no way to prevent people from doing dumb things (exempting perhaps repeated 
applications of a cattle prod)- these are strictly automatic checks, akin to repoman's
dependency checks.


The start of a different phase of backwards compatibility
---------------------------------------------------------

As clarified above, new eclasses will exist in a separate directory that will be
intentionally inaccessible to the inherit function.  As such, users of older
portage versions *will* have to upgrade to merge any ebuild that uses elibs/new
eclasses.  A depend on the next major portage version would transparently handle 
this for rsync users.

There still is the issue of users who haven't upgraded to the required portage
version.  This is a minor concern frankly- portage releases include new
functionality, and bug fixes.  If they won't upgrade, it's assumed they have
their reasons and are big boys, thus able to handle the complications themselves.

The real issue is broken envs, whether in binpkgs, or for installed packages.
Two options exist- either the old eclasses are left in the tree indefinitely, or
they're left for N months, then shifted out of the tree, and into a tarball that
can be merged.

Shifting them out of the tree is advisable for several reasons- less cruft in
the tree, but more importantly the fact that they are not signed (thus an angle
for attack).  Note that the proposed method of eclass signing doesn't even try
to address them.  Frankly, it's not worth the effort supporting two variations
of eclass signing, when the old eclass setup isn't designed to allow for easy
signing.

If this approach is taken, then either the old eclasses would have to be merged
to an overlay directory's eclass directory (ugly), or to a safe location that
portage's inherit function knows to look for (less ugly).

For users who do not upgrade within the window of N months while the old
eclasses are in the tree, as stated, it's assumed they know what they are doing.
If they specifically block the new portage version, as the ebuilds in the tree
migrate to the new eclasses, they will have less and less ebuilds available to
them.  If they tried injecting the new portage version (lying to portage,
essentially), portage would bail since it cannot find the new eclass.  
For ebuilds that use the new eclasses, there really isn't any way to sidestep 
the portage version requirement- same as it has been for other portage features.

What is a bit more annoying is that once the old eclasses are out of the tree,
if a user has not upgraded to a portage version supporting env processing, they 
will lose the ability to unmerge any installed ebuild that used an old
eclass.  Same cause, different symptom being they will lose the ability to merge 
any tbz2 that uses old eclasses also.

There is one additional case that is a rarity, but should be noted- if a user 
has suffered significant corruption of their installed package database (vdb).  This is 
ignoring the question of whether the vdb is even usable at this point, but the possibility
exists for the saved envs to be non usable due to either A) missing, or B) corrupted.
In such a case, even with the new portage capabilities, they would need
the old eclass compat ebuild.  

Note for this to happen requires either rather... unwise uses of root, or significant 
fs corruption.  Regardless of the cause, it's quite likely for this to even become an 
issue, the system's vdb is completely unusable.  It's a moot issue at that point.
If you lose your vdb, or it gets seriously damaged, it's akin to lobotomizing portage- 
it doesn't know what's installed, it doesn't know of its own files, and in general, 
a rebuilding of the system is about the only sane course of action.  The missing env is 
truly the least of the users concern in such a case.

Continuing with the more likely scenario, users unwilling to upgrade portage will
*not* be left out in the rain.  Merging the old eclass compat ebuild will provide 
the missing eclasses, thus providing that lost functionality.

Note the intention isn't to force them to upgrade, hence the ability to restore the
lost functionality.  The intention is to clean up the existing mess, and allow us
to move forward.  The saying "you've got to break a few eggs to make an omelet"
is akin, exempting the fact we're providing a way to make the eggs whole again
(the king's men would've loved such an option).


Migrating to the new setup
--------------------------

As has been done in the past whenever a change in the tree results in ebuilds
requiring a specific version of portage, as ebuilds migrate to the new eclasses,
they should depend on a version of portage that supports it.  From the users
viewpoint, this transparently handles the migration.

This isn't so transparent for devs or a particular infrastructure server however.
Devs, due to them using cvs for their tree, lack the pregenerated cache rsync
users have.   Devs will have to be early adopters of the new portage.  Older
portage versions won't be able to access the new eclasses, thus the local cache
generation for that ebuild will fail, ergo the depends on a newer portage
version won't transparently handle it for them.

Additionally, prior to any ebuilds in the tree using the new eclasses, the
infrastructure server that generates the cache for rsync users will have to
either be upgraded to a version of portage supporting new eclasses, or patched.
The former being much more preferable then the latter for the portage devs.

Beyond that, an appropriate window for old eclasses to exist in the tree must be
determined, and prior to that window passing, an ebuild must be added to the tree
so users can get the old eclasses if needed.

For eclass devs to migrate from old to new, it is possible for them to just
transfer the old eclass into an appropriate grouping in the new eclass directory,
although it's advisable they cleanse all cruft out of the eclass.  You can
migrate ebuilds gradually over to the new eclass, and don't have to worry about
having to support ebuilds from X years back.

Essentially, you have a chance to nail the design perfectly/cleanly, and have a
window in which to redesign it.  It's humbly suggested eclass devs take
advantage of it. :)


Backwards Compatibility
=======================

All backwards compatibility issues are addressed in line, but a recap is offered-
it's suggested that if the a particular compatibility issue is
questioned/worried over, the reader read the relevant section.  There should be
a more in depth discussion of the issue, along with a more extensive explanation
of the potential solutions, and reasons for the chosen solution.

To recap:
::

	New eclasses and elib functionality will be tied to a specific portage
	version.  A DEPENDs on said portage version should address this for rsync
	users who refuse to upgrade to a portage version that supports the new
	eclasses/elibs and will gradually be unable to merge ebuilds that use said
	functionality.  It is their choice to upgrade, as such, the gradual
	'thinning' of available ebuilds should they block the portage upgrade is
	their responsibility.
	
	Old eclasses at some point in the future should be removed from the tree,
	and released in a tarball/ebuild.  This will cause installed ebuilds that
	rely on the old eclass to be unable to unmerge, with the same applying for 
	merging of binpkgs dependent on the following paragraph.
		
	The old eclass-compat is only required for users who do not upgrade their 
	portage installation, and one further exemption- if the user has somehow
	corrupted/destroyed their installed pkgs database (/var/db/pkg currently), 
	in the process, they've lost their saved environments.  The eclass-compat
	ebuild would be required for ebuilds that required older eclasses in such a 
	case.  Note, this case is rare also- as clarified above, it's mentioned 
	strictly to be complete, it's not much of a real world scenario as elaborated
	above.


Copyright
=========

This work is licensed under the Creative Commons Attribution-ShareAlike 3.0
Unported License.  To view a copy of this license, visit
http://creativecommons.org/licenses/by-sa/3.0/.