Non-java resources creating split-package problems?

Stephan Herrmann
Hi,

Back when working on our compiler, I didn't pay much attention to non-Java
resources, as the JLS doesn't make any mention of them.

Recently, however, we found that plain text files in a modular jar can very well
cause the VM to refuse to start, complaining:

java.lang.LayerInstantiationException: Package foo in both module x and module y
        at java.base/jdk.internal.module.ModuleBootstrap.checkSplitPackages(ModuleBootstrap.java:470)

I would like to report this problem at build time, to avoid bad surprises at
launch time. For this I would like to learn the rules for how non-Java
resources participate in JLS 7.4.3 ("uniquely visible").

The closest I could find was in the javadoc of
   java.lang.Module.getResourceAsStream(String)
While it doesn't touch on the split-package issue, it introduces a distinction
between encapsulated and non-encapsulated resources, based on a package name
(while at the same time, judging by the JLS, no package exists). I guess this
holds part of my answer, but I still couldn't quite connect the remaining dots.

Where is it defined in which situations non-Java resources contribute to
illegally split packages?

Can a resource "package" conflict with a Java package in another module? Which
of the two needs to be exported / opened in order to create a conflict? Etc...

best,
Stephan

Re: Non-java resources creating split-package problems?

Alan Bateman
On 24/08/2019 21:11, Stephan Herrmann wrote:
> Hi,
>
> back then when working on our compiler I didn't pay much attention to
> non-Java resources, as JLS doesn't make any mention of them.
>
> Recently, however, we found that plain text files in a modular jar can
> well cause the VM to refuse starting,
I think you are mainly asking how to determine the set of packages in a
(named) module.

Automatic modules are straightforward. The set of packages is
determined from the non-directory entries in the JAR file that end in
".class". The API docs for ModuleFinder.of(Path...) have all the details.

Explicit modules have more to their story. They can be exploded on the
file system or may be packaged into modular JAR or other formats. What
is the set of packages in an explicit module? The Module class file
attribute can help as it contains the set of packages that are exported,
open or contain service provider implementations. However, it doesn't
contain the packages that are not exported, not opened, and don't
contain service provider implementations. Determining the complete set
of packages in a module will often require scanning the contents of the
module. This can mean scanning the file system (exploded module) or
scanning the contents of a JAR file (modular JAR). This is an area where
the ModuleFinder API docs could say more. When an exploded module or modular
JAR is scanned, the entries are mapped to candidate package names. If an
entry maps to a legal package name then it will be added to the set of
packages in the module. If you link this to the encapsulation specified
by the Class/Module getResourceXXX methods then it starts to become
clear that non-class resources will be encapsulated if the package isn't
opened to other modules.
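
To make the mapping concrete, here is a minimal sketch of how entries might be turned into candidate package names. It is not the JDK's code: the class and method names are hypothetical, the legality check is simplified (it tests identifier characters only and does not reject reserved words), and `com/example/Main.class` is an invented entry. The other entry names are taken from this thread.

```java
import java.util.Collection;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of the entry-to-package mapping; not the JDK's implementation. */
final class PackageScan {

    /** Maps "a/b/C.txt" to "a.b"; empty for top-level entries or illegal package names. */
    static Optional<String> toPackageName(String entryName) {
        int slash = entryName.lastIndexOf('/');
        if (slash < 0) {
            return Optional.empty();              // top-level entry: no package
        }
        String candidate = entryName.substring(0, slash).replace('/', '.');
        return isLegalPackageName(candidate) ? Optional.of(candidate) : Optional.empty();
    }

    /** Simplified check: every segment is made of Java identifier characters. */
    static boolean isLegalPackageName(String name) {
        for (String part : name.split("\\.", -1)) {
            if (part.isEmpty() || !Character.isJavaIdentifierStart(part.charAt(0))) {
                return false;
            }
            for (int i = 1; i < part.length(); i++) {
                if (!Character.isJavaIdentifierPart(part.charAt(i))) {
                    return false;
                }
            }
        }
        return true;
    }

    /** Explicit module: every non-directory entry may contribute a package. */
    static Set<String> explicitModulePackages(Collection<String> entries) {
        Set<String> packages = new TreeSet<>();
        for (String e : entries) {
            if (!e.endsWith("/")) {
                toPackageName(e).ifPresent(packages::add);
            }
        }
        return packages;
    }

    /** Automatic module: only non-directory entries ending in ".class" contribute. */
    static Set<String> automaticModulePackages(Collection<String> entries) {
        Set<String> packages = new TreeSet<>();
        for (String e : entries) {
            if (!e.endsWith("/") && e.endsWith(".class")) {
                toPackageName(e).ifPresent(packages::add);
            }
        }
        return packages;
    }

    public static void main(String[] args) {
        List<String> entries = List.of(
            "module-info.class", "com/example/Main.class",
            "about_files/EPL.txt", "about-files/EPL.txt", "META-INF/legal/EPL.txt");
        System.out.println(explicitModulePackages(entries));   // [about_files, com.example]
        System.out.println(automaticModulePackages(entries));  // [com.example]
    }
}
```

Under these assumptions a text file in `about_files/` adds a package to an explicit module but not to an automatic one, while `about-files/` and `META-INF/legal/` never map to a package because of the hyphen.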

As an optimization (to avoid potentially expensive scanning) the
module-info can include a ModulePackages class file attribute that
contains the set of packages in the module. The `jar` tool adds (or
updates) this attribute when creating (or updating) a modular JAR. This
means the set of packages in the module may be determined at packaging
time rather than in other phases, and the scanning done at packaging time
needs to exactly match the scanning that would be done when the
ModulePackages attribute is not present. Again, this is an area where the
ModuleFinder API docs could say more.

-Alan

Re: Non-java resources creating split-package problems?

Stephan Herrmann
Thanks Alan,

This looks like quite a party of stakeholders discussing what is or is not a
package. Let me check if I get it right:

(1) JLS requires a package declaration in a compilation unit for a package to exist.

(2) a class file attribute in module-info.class may declare the set of packages.

(3) for automatic modules the rules are effectively similar to (1), provided
that .class files have been placed in the proper folder in the jar.

(4) for explicit modules a new kind of package comes into the picture: packages
created by non-java resources.


As you mention that ModuleFinder could say more about (4), I'd re-phrase it:
currently nothing positively defines this concept.


If the above list is correct (complete?), some questions remain:

- What should happen when (2) declares a set of packages that differs from what
scanning would result in? Can (2), e.g., be used to hide a package?

- Is it possible to export/open a package that has no .class in it? According to
JLS: no!
   - does this imply non-java packages are always encapsulated?

- Can any package from (4) contribute to a conflict vis-à-vis JLS 7.4.3? Due to
lack of export we may have to say: a non-java package in the current module vs.
an exported java package from a read module, is that a conflict? Who should
signal the conflict, if any?

- Is it true that a folder that is not a package never conflicts with a
same-named folder in another module?

- What is the rationale for the fact that adding module-info may result in more
packages (due to difference (3) vs. (4))?

- Where can we read the difference between what JLS allows (including concealed
packages of the same name in different modules) vs. what the boot layer
implementation accepts? Should we assume that the boot layer behaves as if
created by ModuleLayer.defineModulesWithOneLoader()? The latter's javadoc is the
only spec I could find mentioning the problem of "overlapping packages", but
clearly that method cannot create the boot layer due to the restriction
regarding java.base.

- javadoc of Module.getResourceAsStream() has a list of two bullets introduced
by "Whether a resource can be located or not is determined as follows:". The two
bullets may be in conflict; should we assume that the second bullet starts with
"Otherwise"?

- the same javadoc speaks only of unnamed / named modules. But what happens in
an automatic module? Isn't the javadoc saying that an automatic (=named) module
may consider a non-java resource to be in a package, while according to (3)
automatic modules have no non-java packages?


FWIW, the whole thing surfaced because adding module-info to some libraries
triggered the mentioned LayerInstantiationException, the culprit being something
like about_files/EPL.txt appearing in each library. While the exception itself was
a bit of a shock, I was then surprised to see that renaming about_files to
about-files resolves the problem. If that is an intended solution, shouldn't it
be advertised much more widely? Note that no individual library owner will see any
problem, the problem only arises when trying to combine several unfortunate
libraries.

Stephan

On 26.08.19 10:13, Alan Bateman wrote:

> On 24/08/2019 21:11, Stephan Herrmann wrote:
>> Hi,
>>
>> back then when working on our compiler I didn't pay much attention to non-Java
>> resources, as JLS doesn't make any mention of them.
>>
>> Recently, however, we found that plain text files in a modular jar can well
>> cause the VM to refuse starting,
> I think you are mainly asking how to determine the set of packages in a (named)
> module.
>
> Automatic modules are straightforward. The set of packages is determined from
> the non-directory entries in the JAR file that end in ".class". The API docs for
> ModuleFinder.of(Path...) have all the details.
>
> Explicit modules have more to their story. They can be exploded on the file
> system or may be packaged into modular JAR or other formats. What is the set of
> packages in an explicit module? The Module class file attribute can help as it
> contains the set of packages that are exported, open or contain service provider
> implementations. However, it doesn't contain the packages that are not
> exported, not opened, and don't contain service provider implementations.
> Determining the complete set of packages in a module will often require scanning
> the contents of the module. This can mean scanning the file system (exploded
> module) or scanning the contents of a JAR file (modular JAR). This is an area
> where the ModuleFinder API docs could say more. When an exploded module or
> modular JAR is scanned, the entries are mapped to candidate package names. If an
> entry maps to a legal package name then it will be added to the set of packages in the module.
> If you link this to the encapsulation specified by the Class/Module
> getResourceXXX methods then it starts to become clear that non-class resources
> will be encapsulated if the package isn't opened to other modules.
>
> As an optimization (to avoid potentially expensive scanning) the module-info can
> include a ModulePackages class file attribute that contains the set of packages
> in the module. The `jar` tool adds (or updates) this attribute when creating (or
> updating) a modular JAR. This means the set of packages in the module may be
> determined at packaging time rather than in other phases, and the scanning done at
> packaging time needs to exactly match the scanning that would be done when the
> ModulePackages attribute is not present. Again, this is an area where the
> ModuleFinder API docs could say more.
>
> -Alan


Re: Non-java resources creating split-package problems?

Alan Bateman
On 26/08/2019 22:37, Stephan Herrmann wrote:
> :
>
> If the above list is correct (complete?), some questions remain:
>
> - What should happen when (2) declares a set of packages that differs
> from what scanning would result in? Can (2), e.g., be used to hide a
> package?
The ModulePackages attribute would win in that case, at least in the JDK
implementation. The likely implications would be compilation errors or
NoClassDefFoundErrors at run-time because the classes in that package
would not be visible.

>
> - Is it possible to export/open a package that has no .class in it?
> According to JLS: no!
>   - does this imply non-java packages are always encapsulated?
I think you are looking for JLS 7.7.2.

At run-time, a non-class resource will be encapsulated if the
corresponding package is not open to other modules.
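
As a rough illustration of that rule, the sketch below restates the two bullets from the Module.getResourceAsStream javadoc using only public `java.lang.Module` methods. The class and helper names are hypothetical, and `jdk/internal/misc/notes.txt` is an invented resource name chosen only because `jdk.internal.misc` is a concealed package of java.base. The helper does not check whether the resource exists.

```java
/** Hypothetical sketch of the resource encapsulation rule; not the JDK's implementation. */
final class ResourceEncapsulation {

    /** Derives the package name: the text before the last '/', with '/' replaced by '.'. */
    static String packageOf(String resourceName) {
        String name = resourceName.startsWith("/") ? resourceName.substring(1) : resourceName;
        int slash = name.lastIndexOf('/');
        return slash < 0 ? "" : name.substring(0, slash).replace('/', '.');
    }

    /** True if a resource with this name in the given module would be hidden from the caller. */
    static boolean isEncapsulated(Module module, String resourceName, Module caller) {
        if (!module.isNamed() || resourceName.endsWith(".class")) {
            return false;   // unnamed modules and ".class" resources are never encapsulated
        }
        String pn = packageOf(resourceName);
        if (!module.getPackages().contains(pn)) {
            return false;   // not in one of the module's packages, e.g. META-INF/...
        }
        return !module.isOpen(pn, caller);
    }

    public static void main(String[] args) {
        Module javaBase = Object.class.getModule();
        Module me = ResourceEncapsulation.class.getModule();
        System.out.println(isEncapsulated(javaBase, "java/lang/Object.class", me));
        System.out.println(isEncapsulated(javaBase, "jdk/internal/misc/notes.txt", me));
        System.out.println(isEncapsulated(javaBase, "META-INF/MANIFEST.MF", me));
    }
}
```

Because an automatic module's package set comes only from its ".class" entries, a folder such as about_files is not one of its packages, so the second branch applies and the resource is never encapsulated there.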

>
> - Can any package from (4) contribute to a conflict vis-à-vis JLS
> 7.4.3? Due to lack of export we may have to say: a non-java package in
> the current module vs. an exported java package from a read module, is
> that a conflict? Who should signal the conflict, if any?
Not at compile-time.

At run-time you cannot map two modules containing the same package to
the same class loader. In your example, it sounds like you have two
explicit modules with package "about_files" so they conflict when mapped
to the application class loader.
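
For anyone who wants to detect this before launch, a build-time check could be sketched roughly as follows. It relies on `ModuleFinder.of(Path...)` to compute each module's package set and then looks for packages owned by more than one module. The class and method names are hypothetical, and the check is deliberately coarse: it treats all modules on the given path as if they were mapped to one class loader and ignores system modules.

```java
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleFinder;
import java.lang.module.ModuleReference;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical build-time check for packages contained in more than one module. */
final class SplitPackageCheck {

    /** Returns package -> owning modules, for every package found in two or more modules. */
    static Map<String, Set<String>> findSplitPackages(Map<String, Set<String>> packagesByModule) {
        Map<String, Set<String>> owners = new TreeMap<>();
        packagesByModule.forEach((module, packages) -> {
            for (String pn : packages) {
                owners.computeIfAbsent(pn, k -> new TreeSet<>()).add(module);
            }
        });
        owners.values().removeIf(modules -> modules.size() < 2);
        return owners;
    }

    /** Collects the package sets that ModuleFinder derives for the given module path entries. */
    static Map<String, Set<String>> packagesOnModulePath(Path... entries) {
        Map<String, Set<String>> result = new TreeMap<>();
        for (ModuleReference ref : ModuleFinder.of(entries).findAll()) {
            ModuleDescriptor descriptor = ref.descriptor();
            result.put(descriptor.name(), descriptor.packages());
        }
        return result;
    }

    public static void main(String[] args) {
        // Each argument is a module path entry: a modular JAR, an exploded module, or a directory.
        Path[] modulePath = Arrays.stream(args).map(Path::of).toArray(Path[]::new);
        findSplitPackages(packagesOnModulePath(modulePath)).forEach((pn, modules) ->
            System.out.println("Package " + pn + " in modules " + modules));
    }
}
```

Run against the JAR files from your example, this should list "about_files" together with every explicit module that contains it.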

> :
>
> - What is the rationale for the fact that adding module-info may
> result in more packages (due to difference (3) vs. (4))?
Non-class resources in explicit modules can be encapsulated. Resources
in automatic modules cannot be encapsulated. The reason that non-class
resources don't contribute to the set of packages in an automatic
module is to maximize the potential for use of existing JAR files as
modules. In your "about_files" case, I assume the two JAR files
would have worked as automatic modules.

>
> - Where can we read the difference between what JLS allows (including
> concealed packages of the same name in different modules) vs. what the
> boot layer implementation accepts? Should we assume that the boot
> layer behaves as if created by
> ModuleLayer.defineModulesWithOneLoader()? The latter's javadoc is the
> only spec I could find mentioning the problem of "overlapping
> packages", but clearly that method cannot create the boot layer due to
> the restriction regarding java.base.
The boot layer is special: think of ModuleLayer::defineModules with a
function that maps the modules to the built-in class loaders, but with all
the restrictions of defineModulesWithOneLoader. You are right that the
paragraph on the boot layer in the ModuleLayer docs could say more on this.
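
The single-loader restriction can be probed outside the boot layer with a small sketch like the one below. It is only an approximation of the boot layer's behaviour, since the boot layer uses the built-in class loaders rather than a newly created one. The class name and command-line handling are hypothetical; the `Configuration` and `ModuleLayer` calls are the public API.

```java
import java.lang.module.Configuration;
import java.lang.module.ModuleFinder;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Set;

/** Hypothetical probe: define a set of modules to one class loader in a child layer. */
final class LayerProbe {

    /** Resolves the root modules from the module path and defines them all to one loader. */
    static ModuleLayer defineWithOneLoader(Set<String> roots, Path... modulePath) {
        ModuleLayer boot = ModuleLayer.boot();
        Configuration cf = boot.configuration()
                .resolve(ModuleFinder.of(modulePath), ModuleFinder.of(), roots);
        // Throws LayerInstantiationException if the modules cannot share one loader,
        // for example because two of them contain the same package.
        return boot.defineModulesWithOneLoader(cf, ClassLoader.getSystemClassLoader());
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: java LayerProbe <module-path-entry> <root-module>...");
            return;
        }
        Set<String> roots = Set.copyOf(Arrays.asList(args).subList(1, args.length));
        try {
            ModuleLayer layer = defineWithOneLoader(roots, Path.of(args[0]));
            System.out.println("Layer created with modules: " + layer.modules());
        } catch (LayerInstantiationException e) {
            System.out.println("Layer rejected: " + e.getMessage());
        }
    }
}
```

With two explicit modules that both contain "about_files" as roots, the expectation is that resolution succeeds (the package is concealed in both) and the failure only appears when the layer is defined.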

>
> - the same javadoc speaks only of unnamed / named modules. But what
> happens in an automatic module? Isn't the javadoc saying that an
> automatic (=named) module may consider a non-java resource to be in a
> package, while according to (3) automatic modules have no non-java
> packages?
I don't see an issue here as these resources are not in one of the
module's packages and therefore cannot be encapsulated.

-Alan

Re: Non-java resources creating split-package problems?

Stephan Herrmann
Much clearer, still a few comments.

On 27.08.19 09:05, Alan Bateman wrote:
>> - What should happen when (2) declares a set of packages that differs from
>> what scanning would result in? Can (2), e.g., be used to hide a package?
> The ModulePackages attribute would win in that case, at least in the JDK
> implementation. The likely implications would be compilation errors or
> NoClassDefFoundErrors at run-time because the classes in that package would not
> be visible.

For the original issue in this thread this could actually be a solution: hide
non-java packages from the layer implementation. Since those files aren't even
meant to be accessed at runtime I don't expect any harm. Except that we'd be
relying on details of the default implementation.

>> - Is it possible to export/open a package that has no .class in it? According
>> to JLS: no!
>>   - does this imply non-java packages are always encapsulated?
> I think you are looking for JLS 7.7.2.

That's where I was looking and that's why I wondered if those packages are
doomed to be always "encapsulated".

> At run-time, a non-class resource will be encapsulated if the corresponding
> package is not open to other modules.

As you seem to see both open and not-open as possibilities, I guess as a last
resort there's API where the package can still be opened at runtime, despite JLS
7.7.2. Fair enough.

> In your "about_files" case then I assume the two JAR files would have worked
> as automatic modules.

Yes, they did. Only "improving" them to explicit modules broke clients.


Stepping back, I see one source of confusion in the 3-valued semantics of the
term "encapsulated":

(a) a package can be encapsulated, so JPMS rules prohibit any access from
outside the module

(b) a package can be exported / visible, so JPMS includes it in uniquely-visible
checks.

(c) a folder can be legacy / not-a-package, so JPMS rules don't apply at all.


When packaging files that are irrelevant for access by clients, one might be
tempted to use (a), but when run in a single-classloader runtime this can
unexpectedly blow up. Here, encapsulation does not establish the independence
that common sense would suggest.

For the problem at hand I see two possible solutions:
- the "hack" of forging the ModulePackages attribute
- rename the folder to change it from encapsulated (a) to legacy (c).

Both options have some kind of negative smell, so let me ask: which solution is
recommended when trying to group legalese files in a folder inside a jar that
should NEVER be subjected to JPMS-specific conflict checks?

Option 3 (?):
- never run applications in single-classloader runtimes

thanks,
Stephan

PS: Bottom line for tool implementors seems to be: none of these problems should
be reported at build time, just let it crash at runtime?

Re: Non-java resources creating split-package problems?

Alan Bateman
On 27/08/2019 18:09, Stephan Herrmann wrote:

> :
>
> For the problem at hand I see two possible solutions:
> - the "hack" of forging the ModulePackages attribute
> - rename the folder to change it from encapsulated (a) to legacy (c).
>
> Both options have some kind of negative smell, so let me ask: which
> solution is recommended when trying to group legalese files in a
> folder inside a jar that should NEVER be subjected to JPMS-specific
> conflict checks?
The JAR file format doesn't have a standard location for such files.
Also, the `jar` tool doesn't have an option to exclude specific locations
when it computes the set of packages. The top-level directory or a
location such as META-INF/legal will work of course.

-Alan