Scanning multi version jars?

Scanning multi version jars?

Greg Wilkins
I hope this is the right group for this question; please redirect me if not.

The Jetty project is trying to implement annotation scanning for
multi-version jars and has some concerns about some edge cases,
specifically with inner classes.

A multi versioned jar might contain something like:

   - org/example/Foo.class
   - org/example/Foo$Bar.class
   - META-INF/versions/9/org/example/Foo.class

It is clear that there is a Java 9 version of Foo.  What is unclear is the
status of the inner class Foo$Bar.  Is it only used by the base version of
Foo?  Or does the Java 9 version also use Foo$Bar, which did not need any
Java 9 features, so that the base copy can still be used?

It is unclear from just an index of the jar whether we should be
scanning Foo$Bar for annotations.  So currently it appears that we have to
actually scan the Foo class to see if Foo$Bar is referenced, and only then
scan Foo$Bar for annotations (with recursive analysis for any Foo$Bar$Bob
class)!

An alternative would be an enforced convention that any versioned class
must also version all of its inner classes, which may be a reasonable
assumption given that they would be compiled together, but we see nothing
in the specifications that forces a jar to be assembled that way.

Any guidance anybody can give would be helpful.

cheers
Greg Wilkins

PS. The JarFile.getEntry method does not appear to respect its javadoc
with respect to multi-versioned jars: it says it will do a search for the
most recent version, however the code indicates that the search is only
done if the base version does not exist.  This is kind of a separate issue,
but it makes it difficult to defer the decision of what to scan to the
implementation in JarFile.






--
Greg Wilkins <[hidden email]> CTO http://webtide.com

RE: Scanning multi version jars?

Stephen Felts
We ran into this problem: we have a closed-set class checker, and it has a problem processing MR jar files.

I recommend replacing all inner classes if the ordinary class is versioned.  If the inner class goes away, you would need to stub it so a versioned copy exists.  That is the convention we have adopted for our project, but it is not currently a universal convention (I believe we saw one MR jar that didn't follow it, maybe the new JAXB jar?).

If you use the JDK9 API to scan classes in a jar file for version 9, you will get META-INF/versions/9/org/example/Foo.class  and org/example/Foo$Bar.class.

This is the code that we use to read the classes.

>         // the JDK9 mode.  This is accomplished by supplying the "base version" to the constructor.
>         // Uses java.io.{File, IOException}, java.lang.reflect.{Constructor, Method,
>         // InvocationTargetException}, java.util.jar.JarFile and java.util.zip.ZipFile.
>         // JarFile.baseVersion() only exists on JDK 9+, hence the reflective lookup.
>         Method baseVersionMethod = null;
>         try {
>           baseVersionMethod = JarFile.class.getMethod("baseVersion");
>         } catch (NoSuchMethodException nsme) {
>           // Pre-JDK 9 runtime: fall through to the plain JarFile constructor below.
>         }
>
>         Object baseVersion = null;
>         if (baseVersionMethod != null) {
>           try {
>             baseVersion = baseVersionMethod.invoke(null);  // static method, so no receiver
>           } catch (IllegalAccessException | IllegalArgumentException | InvocationTargetException e) {
>             throw new RuntimeException(e);
>           }
>         }
>
>         // JarFile(File, boolean verify, int mode, Runtime.Version) was added in JDK 9.
>         Constructor<?> jarFileConstructor = null;
>         if (baseVersion != null) {
>           try {
>             jarFileConstructor = JarFile.class.getConstructor(File.class, boolean.class, int.class, baseVersion.getClass());
>           } catch (NoSuchMethodException | SecurityException e) {
>             throw new RuntimeException(e);
>           }
>         }
>
>         JarFile jarFile;
>         if (jarFileConstructor != null) {
>           try {
>             jarFile = (JarFile) jarFileConstructor.newInstance(new File(filePath), Boolean.TRUE, ZipFile.OPEN_READ, baseVersion);
>           } catch (InvocationTargetException i) {
>             // Unwrap so callers still see the IOException the constructor threw.
>             Throwable t = i.getCause();
>             if (t instanceof IOException)
>               throw (IOException) t;
>             throw new RuntimeException(t);
>           } catch (InstantiationException | IllegalAccessException | IllegalArgumentException e) {
>             throw new RuntimeException(e);
>           }
>         } else {
>           jarFile = new JarFile(filePath);
>         }

In our case, when processing an MR jar, we assume that if a class is versioned, we will ignore all related classes (ordinary and inner).
We need to keep a list of all classes and not process any of them until we have the full list.
That is, we post-process the class list so that we drop ordinary classes and inner classes if there is a versioned replacement of any of them.
We avoid false failures by taking this approach (we ignore inner classes for older releases that reference a class that might no longer exist).

We know that this is not purely correct.  It's possible that the versioned replacement class references the non-versioned inner class.
I think that the correct way would be to see if the inner class is referenced by any class file, but that isn't doable, so you will need to choose a heuristic.

In the case of annotation processing, it seems the heuristic should be to include the non-versioned inner class, since it could be referenced by another class somewhere. That should be simple, since you don't need to wait to process files.





Re: Scanning multi version jars?

Alan Bateman
In reply to this post by Greg Wilkins


On 13/09/2017 23:12, Greg Wilkins wrote:
> I hope this is the right group for this question. please redirect me if not.
Probably core-libs-dev as this isn't anything to do with modules but in
any case ...

>
> The Jetty project is trying to implement annotation scanning for multi
> version jars and have some concerns with some edge cases, specifically with
> inner classes.
>
> A multi versioned jar might contain something like:
>
>     - org/example/Foo.class
>     - org/example/Foo$Bar.class
>     - META-INF/versions/9/org/example/Foo.class
>
> It is clear that there is a java 9 version of Foo.  But what is unclear is
> the inner class Foo$Bar?  Is that only used by the base Foo version? or
> does the java 9 version also use the Foo$Bar inner class, but it didn't use
> any java 9 features, so the base version is able to be used??
>
> It is unclear from just an index of the jar if we should be
> scanning Foo$Bar for annotations.  So currently it appears that we have to
> actually scan the Foo class to see if Foo$Bar is referenced and only then
> scan Foo$Bar for annotations (and recursive analysis for any Foo$Bar$Bob
> class )!
Is Foo$Bar public and part of org.example's API? If so then I would
expect compiling the 9 version of Foo.java will generate a class file
for each of the inner classes and so the scenario shouldn't arise.  If
Foo$Bar is not public (and so not part of org.example's API), and you
are scanning non-public classes, then it would need special handling and
examination of the InnerClasses attribute in both classes. I wouldn't
expect it to arise often enough to cause a performance issue (assuming
that is the concern).

>
> PS. The JarFile.getEntry method does not appear to respect its javadoc
> with respect to multiversioned jars: it says it will do a search for the
> most recent version, however the code indicates that the search is only
> done if the base version does not exist.  This is kind of separate issue,
> but makes it difficult to defer the behaviour of what to scan to the
> implementation in JarFile
getEntry/getJarEntry will return a JarEntry if it is present in the base
section or a versioned section (<= max version you specified when
opening the JarFile) or both. I agree the javadoc is a bit confusing and
could be improved but I assume you don't actually have an issue here as
the JarEntry returned will locate the entry in the versioned section if
it exists.
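
The lookup behaviour described above can be checked with a small experiment. This is only a sketch under stated assumptions: the class name MrJarLookup, the helper names and the marker strings "base", "base-inner" and "v9" are made up for illustration, and the "class" files hold marker text rather than bytecode. It builds the three-entry jar from the original question, opens it once with JarFile.baseVersion() and once for release 9, and reads what getJarEntry resolves to.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;
import java.util.zip.ZipFile;

// Hypothetical demo class, not code from the thread. Needs JDK 9 or later.
final class MrJarLookup {
    /** Writes a tiny multi-release jar whose entries contain marker text. */
    static Path writeDemoJar() throws IOException {
        Manifest mf = new Manifest();
        mf.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        mf.getMainAttributes().put(Attributes.Name.MULTI_RELEASE, "true");
        Path jar = Files.createTempFile("mr-demo", ".jar");
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out, mf)) {
            put(jos, "org/example/Foo.class", "base");
            put(jos, "org/example/Foo$Bar.class", "base-inner");
            put(jos, "META-INF/versions/9/org/example/Foo.class", "v9");
        }
        return jar;
    }

    static void put(JarOutputStream jos, String name, String body) throws IOException {
        jos.putNextEntry(new JarEntry(name));
        jos.write(body.getBytes(StandardCharsets.UTF_8));
        jos.closeEntry();
    }

    /** Content that getJarEntry(name) resolves to when the jar is opened for a release. */
    static String lookup(Path jar, String name, Runtime.Version release) throws IOException {
        try (JarFile jf = new JarFile(jar.toFile(), true, ZipFile.OPEN_READ, release)) {
            JarEntry entry = jf.getJarEntry(name);
            try (InputStream in = jf.getInputStream(entry)) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path jar = writeDemoJar();
        Runtime.Version nine = Runtime.Version.parse("9");
        System.out.println(lookup(jar, "org/example/Foo.class", JarFile.baseVersion()));
        System.out.println(lookup(jar, "org/example/Foo.class", nine));
        System.out.println(lookup(jar, "org/example/Foo$Bar.class", nine));
        Files.delete(jar);
    }
}
```

The three lookups should resolve to the base Foo, the versioned Foo and the base Foo$Bar respectively, which is exactly the mixed view the original question worries about.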

-Alan

Re: Scanning multi version jars?

Weijun Wang

> On Sep 14, 2017, at 5:07 PM, Alan Bateman <[hidden email]> wrote:
>
> On 13/09/2017 23:12, Greg Wilkins wrote:
>> I hope this is the right group for this question. please redirect me if not.
> Probably core-libs-dev as this isn't anything to do with modules but in any case ...

A related question:

I know an MR jar allows you to shadow a class file with a release-specific one, but what if the new release has removed an old class? It will not appear in the release-specific directory but still exists in the root. Should we describe this in the MANIFEST?

Thanks
Max


Re: Scanning multi version jars?

Alan Bateman
On 14/09/2017 10:58, Weijun Wang wrote:
> :
> I know an MR jar allows you to shadow a class file with a release-specific one, but what if the new release has removed an old class? It will not appear in the release-specific directory but still exists in the root. Should we describe this in the MANIFEST?
>
An MR JAR is not intended to support multiple versions of the same
library; instead, the versioned sections are for classes that take
advantage of newer language or API features. They help with the
migration from using JDK internal APIs to supported/standard APIs, for
example. So I don't think it should be complicated by an additional list
of entries to "hide" in the base or overlaid version sections.

Greg's mail doesn't say if Bar is public so I can't tell if his example
involves an attempted API change or not. Assuming Bar is not public then
compiling the 9 version of Foo.java will generate Foo.class and no
Foo$Bar.class. This doesn't mean it's completely orphaned of course as
there may be other classes in the base section, and in the same package,
that were compiled with references to Bar. The `jar` tool could do some
additional validation to catch these references and so avoid
IncompatibleClassChangeError at runtime (as might arise if
getEnclosingClass were invoked on the inner class). That would help with
Greg's annotation scanning scenario too.

-Alan

Re: Scanning multi version jars?

Greg Wilkins
Alan,

thanks for correcting me on the API of JarFile - I can see it kind of
works, but in a very bizarre way (it gives different content for entries
obtained via the enumerator vs the getJarEntry API, even though both
entries report the same name).  But I'll discuss that elsewhere.



The main issue that still remains is that it is entirely unclear what files
we should scan.   I understand the nuanced point that you are trying to
make, i.e. that "it depends" on whether the class is public or private,
whether it is an API change, whether it is an alternate implementation
rather than a new version of the same library, etc.  I also totally
understand that there are intended uses and unintended uses for this feature.

However, as an implementer of an application container, it does not matter
whether I understand the nuances of MR jars and their intended usage.  What
matters is whether the developers of the 3rd party jars that will be
deployed in my container understand those nuances.    We have to look at
jars that are supplied by third parties, with various levels of
understanding, perhaps with some tricky clever ideas about how to mess with
the system, and we have to decide which classes we are going to scan for
annotations.

This is NOT a performance issue.  It is a consistency/portability issue. We
have to make exactly the same decisions as all the other application
containers out there, else 3rd party library jars will act differently on
different containers.

Thus it looks like we need some kind of heuristic to guess what the 3rd
party developer intended when they used the MR feature.    Some approaches
will need us to scan all the outer and inner classes to determine if the
inner classes are referenced and if they are public or private.

The heuristic could then be to analyse an inner class IFF it is public and
referenced.    Or perhaps that should be if it is public OR referenced?

Alternatively, can we just have a heuristic based only on the index?  If
Foo exists as a versioned class, then only similarly versioned Foo$Bar
classes should be scanned and base Foo$Bar classes will be ignored?

All of these are possible.  But we need an official documented (perhaps
tool enforced) policy so that all containers can implement the same
heuristic so that we can have portability.

Ideally, the containers would not need to implement this heuristic, as it
would be implemented in the enumerator of JarFile.  Unfortunately that is
not the case and the enumerator returns all the entries regardless of
version.   So containers must implement their own enumeration and we need
to make sure we all implement it the same!

regards






Re: Scanning multi version jars?

Alan Bateman


On 15/09/2017 03:09, Greg Wilkins wrote:
>
> Alan,
>
> thanks for correcting me on the API of JarFile - I can see it kind of
> works, but in a very bizarre way (it gives different content for
> entries obtained via the enumerator vs the getJarEntry API, even
> though both entries report the same name).  But I'll discuss that
> elsewhere.
This is something that was discussed on core-libs-dev on a number of
occasions. The summary is that JarFile needs a new API for this;
versionedStream() was suggested, but it was kicked down the road for
later in order to deal with the fallout from adding MR JARs.

Since you have access to the code, look at
jdk.internal.util.jar.VersionedStream for example code of what I
think you are looking for.

-Alan

Re: Scanning multi version jars?

Greg Wilkins
Alan,

I had a quick look at `jdk.internal.util.jar.VersionedStream` and have the
following comments:

   - The style of the API is fine - pass in a JarFile and get a
   Stream<JarEntry>.
   - It might be better to have a Stream<VersionedJarEntry> which includes
   a method to query the actual version of each entry.
   - I think the stream needs to handle inner classes and only include them
   if their matching outer class is available at the same version.  So for
   example a base Foo$Bar.class will only be included if the stream includes a
   base Foo.class, and it will not be included if the Foo.class is version 9
   or above.  Likewise a version 9 Foo$Bar.class will only be included in the
   stream if the stream also includes a version 9 Foo.class, and will not be
   included if the stream has a version 10 or above Foo.class.

If you think this last point is possible, then I'll move the discussion
back to the EE expert groups to try to get an agreement on the exact stream
code that will be used in the mid term until it is available in the JRE
lib, at which time the specs should be amended to say they will defer the
decision of which classes to scan to the JRE lib, so they will be future
proof for any changes in Java 10, 11 etc.
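
The inner class rule in the last bullet can be sketched over a plain list of entry names. This is a rough illustration only, not the JDK's VersionedStream: the class InnerClassFilter and its methods are hypothetical names, "nested" is judged purely by a '$' in the simple file name (which a legal top-level class name may also contain), and nested classes are compared against their top-level class rather than their immediate enclosing class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating the proposed rule; names are assumptions.
final class InnerClassFilter {
    private static final Pattern VERSIONED =
            Pattern.compile("META-INF/versions/(\\d+)/(.+)");
    static final int BASE = 0; // marker for the unversioned (base) section

    /** Real names of the class entries to scan for the given target release. */
    static List<String> select(List<String> entryNames, int target) {
        // Step 1: per class name, keep the highest version that is <= target.
        Map<String, Integer> chosen = new TreeMap<>();
        for (String real : entryNames) {
            Matcher m = VERSIONED.matcher(real);
            boolean versioned = m.matches();
            int version = versioned ? Integer.parseInt(m.group(1)) : BASE;
            String name = versioned ? m.group(2) : real;
            if (!name.endsWith(".class")) continue;
            // Versioned directories below 9 are not recognised by the JDK.
            if (versioned && (version < 9 || version > target)) continue;
            chosen.merge(name, version, Math::max);
        }
        // Step 2: drop nested classes whose top-level class is missing
        // or was chosen from a different section.
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : chosen.entrySet()) {
            String name = e.getKey();
            int dollar = name.indexOf('$', name.lastIndexOf('/') + 1);
            if (dollar >= 0) {
                String outer = name.substring(0, dollar) + ".class";
                if (!e.getValue().equals(chosen.get(outer))) continue;
            }
            result.add(realName(name, e.getValue()));
        }
        return result;
    }

    static String realName(String name, int version) {
        return version == BASE ? name : "META-INF/versions/" + version + "/" + name;
    }
}
```

With the three entries from the original example and a target of 9, only the versioned Foo.class survives; with a target of 8 both base entries are kept.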

cheers





RE: Scanning multi version jars?

Stephen Felts
FWIW I tracked down the MR jar file that I was having trouble with.  It's the stand-alone JAXWS jar file com.sun.xml.ws.jaxws-rt.jar.
Focusing on the problem class, the jar contains:

   - com/sun/xml/ws/util/xml/XmlCatalogUtil$1.class
   - com/sun/xml/ws/util/xml/XmlCatalogUtil.class
   - META-INF/versions/9/com/sun/xml/ws/util/xml/XmlCatalogUtil.class

The inner class XmlCatalogUtil$1.class is not used by the JDK9 version of the outer class.  Further, XmlCatalogUtil$1.class has class/method references that will not be resolved on JDK9.
If the stream includes the inner class, it's possible that whatever is processing it will fail.

In the use case that I had for processing jar files, we generated a collection of all class names in the jar file, and then removed class file names as Greg proposed.
In the above example, com/sun/xml/ws/util/xml/XmlCatalogUtil$1.class and com/sun/xml/ws/util/xml/XmlCatalogUtil.class are removed.
Processing of the class files can only proceed after we generate/modify the entire list for the jar.
It ignores the possibility that a class outside the jar could be referencing something in the inner class that is public.
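
A minimal sketch of that two-pass pruning, assuming it works purely on entry names; the class ClassListPruner and its methods are hypothetical and this is not the actual checker code. The first pass records every class "family" (a top-level class plus its '$' members) that has at least one versioned entry; the second pass drops the base entries of those families.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; a sketch of the post-processing described above.
final class ClassListPruner {
    private static final String PREFIX = "META-INF/versions/";

    /** Drops base classes (outer and inner) that have any versioned replacement. */
    static List<String> prune(Collection<String> names) {
        // Pass 1: the whole list must be seen before anything can be decided.
        Set<String> versionedFamilies = new HashSet<>();
        for (String n : names) {
            if (n.startsWith(PREFIX) && n.endsWith(".class")) {
                // Strip "META-INF/versions/<n>/" to get the base name.
                String base = n.substring(n.indexOf('/', PREFIX.length()) + 1);
                versionedFamilies.add(family(base));
            }
        }
        // Pass 2: keep versioned entries, and base entries of untouched families.
        List<String> kept = new ArrayList<>();
        for (String n : names) {
            if (!n.endsWith(".class")) continue;
            if (n.startsWith(PREFIX) || !versionedFamilies.contains(family(n))) {
                kept.add(n);
            }
        }
        return kept;
    }

    /** Path of the top-level class, without ".class" or any "$..." suffix. */
    static String family(String classEntry) {
        int dollar = classEntry.indexOf('$', classEntry.lastIndexOf('/') + 1);
        int end = dollar >= 0 ? dollar : classEntry.length() - ".class".length();
        return classEntry.substring(0, end);
    }
}
```

Applied to the three XmlCatalogUtil entries listed earlier, this keeps only the entry under META-INF/versions/9. It does not distinguish between several versioned sections, which a real implementation would need to do.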



Re: Scanning multi version jars?

Greg Wilkins
In reply to this post by Greg Wilkins
Alan et al,

here is a VersionedJarFile implementation that filters out inappropriate
inner classes:

https://gist.github.com/gregw/8f305e369d0b769e9c3fe791a0634b13

cheers


Re: Scanning multi version jars?

Alan Bateman
In reply to this post by Greg Wilkins
On 15/09/2017 22:58, Greg Wilkins wrote:

> :
>
>   * I think the stream needs to handle inner classes and only include
>     them if their matching outerclass is available at the same
>     version.  So for example a base Foo$Bar.class will only be
>     included if the stream includes a base Foo.class, and it will not
>     be included if the Foo.class is version 9 or above.  Likewise a
>     version 9 Foo$Bar.class will only be included in the stream if the
>     stream also includes a version 9 Foo.class, and will not be
>     included if the stream has a version 10 or above Foo.class
>
> If you think this last point is possible, then I'll move the
> discussion back the EE expert groups to try to get an agreement on the
> exact stream code that will be used in the mid term until it is
> available in the JRE lib, at which time the specs should be amended to
> say they will defer the decision of which classes to scan the JRE lib
> so they will be future proof for any changes in java 10, 11 etc.
>
I don't think this should be pushed down to the JarFile API. The JarFile
API provides the base API for accessing JAR files and should not be
concerned with the semantics or relationship between entries. I agree
that annotation scanning tools and libraries need to do additional work
to deal with orphaned or menacing inner classes in a MR JAR but it's not
too different to arranging a class path with a JAR file containing the
"classes for JDK 9" ahead of a JAR file containing the version of the
library that runs on JDK 8. I do think that further checks could be done
by the `jar` tool to identify issues at packaging time.

-Alan

Re: Scanning multi version jars?

Greg Wilkins
Alan,

I can sympathise somewhat with that point of view.  However, the counter
point is that the semantics of MR jars come from a new feature released by
the JVM that was perhaps a little under-specified with regard to inner
classes, probably because the JVM itself does not need to interpret that
semantic when classloading; it is only an issue when scanning is involved.

Now that the feature has been released and is being used, it is a bit of a
tall ask to expect the disparate tool vendors, spec groups, container
implementors and general developers to come up with a common semantic
interpretation of MR jars that contain inner classes, especially with the
EE spec groups a bit pre-occupied with their move to Eclipse.

I think my own interpretation is common sense and I'm advocating that it be
adopted by the servlet specification. But who is to say that the CDI groups
won't disagree and come up with another interpretation (it's not like the
two groups can even decide on how to interpret @Resource the same way :)

If core-libs were to signal their intention to provide an implementation of
a particular semantic of inner classes in MR jars, then that would be a
great unifying action that would guide the disparate groups to a common
interpretation of the semantic  that was missing from the original
specification.

regards








Re: Scanning multi version jars?

Paul Sandoz
In reply to this post by Alan Bateman
I agree with Alan here, we should not be pushing a semantic understanding of inner classes into JarFile.

I do sympathise with the case of annotation class scanning, which has always tunnelled through the class loader view to directly get at class file bytes possibly dealing with various URI schemes, since that is currently the only effective way of accessing the required information in an efficient manner.

As Alan mentioned we should add a traversable versioned view of a JarFile, returning a Stream, from which it should be possible to filter according to certain semantics.
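
For reference, a public JarFile.versionedStream() method was later added in JDK 10, after this discussion took place. The sketch below shows how such a view could be traversed and filtered; it is an illustration, not code from the thread. The class VersionedView and its helpers are hypothetical names, the demo jar uses empty entries, and JDK 10 or later is assumed because of versionedStream() and JarEntry.getRealName().

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;
import java.util.stream.Collectors;
import java.util.zip.ZipFile;

// Hypothetical demo class. Needs JDK 10 or later.
final class VersionedView {
    /** Lists "name <- real name" for every class entry in the versioned view. */
    static List<String> classEntries(Path jar, Runtime.Version release) throws IOException {
        try (JarFile jf = new JarFile(jar.toFile(), true, ZipFile.OPEN_READ, release)) {
            return jf.versionedStream()
                    .filter(e -> e.getName().endsWith(".class")) // semantic filters go here
                    .map(e -> e.getName() + " <- " + e.getRealName())
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    /** Writes the example jar from the start of the thread, with empty entries. */
    static Path writeDemoJar() throws IOException {
        Manifest mf = new Manifest();
        mf.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        mf.getMainAttributes().put(Attributes.Name.MULTI_RELEASE, "true");
        Path jar = Files.createTempFile("mr-demo", ".jar");
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out, mf)) {
            for (String name : List.of(
                    "org/example/Foo.class",
                    "org/example/Foo$Bar.class",
                    "META-INF/versions/9/org/example/Foo.class")) {
                jos.putNextEntry(new JarEntry(name));
                jos.closeEntry();
            }
        }
        return jar;
    }
}
```

Opened for release 9, the view pairs org/example/Foo.class with its copy under META-INF/versions/9 and still includes the base Foo$Bar.class, so any inner class policy has to be applied as an extra filter on top of the stream.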

Paul.


> On 17 Sep 2017, at 12:27, Alan Bateman <[hidden email]> wrote:
>
> On 15/09/2017 22:58, Greg Wilkins wrote:
>> :
>>
>>  * I think the stream needs to handle inner classes and only include
>>    them if their matching outerclass is available at the same
>>    version.  So for example a base Foo$Bar.class will only be
>>    included if the stream includes a base Foo.class, and it will not
>>    be included if the Foo.class is version 9 or above.  Likewise a
>>    version 9 Foo$Bar.class will only be included in the stream if the
>>    stream also includes a version 9 Foo.class, and will not be
>>    included if the stream has a version 10 or above Foo.class
>>
>> If you think this last point is possible, then I'll move the discussion back the EE expert groups to try to get an agreement on the exact stream code that will be used in the mid term until it is available in the JRE lib, at which time the specs should be amended to say they will defer the decision of which classes to scan the JRE lib so they will be future proof for any changes in java 10, 11 etc.
>>
> I don't think this should be pushed down to the JarFile API. The JarFile API provides the base API for accessing JAR files and should not be concerned with the semantics or relationship between entries. I agree that annotation scanning tools and libraries need to do additional work to deal with orphaned or menacing inner classes in a MR JAR but it's not too different to arranging a class path with a JAR file containing the "classes for JDK 9" ahead of a JAR file containing the version of the library that runs on JDK 8. I do think that further checks could be done by the `jar` tool to identify issues at packaging time.
>
> -Alan


Re: Scanning multi version jars?

Greg Wilkins
Paul,

yeh... I guess I concede it's not JarFile's job... as much as that would
make things easier for containers to reach agreement :(

However, can we at least look at having a new default method on JarEntry to
query the version. Without that, containers don't have the information
available to perform the semantic filtering required and thus will not be
able to use the stream API and will have to work from an unversioned stream.

regards

On 19 September 2017 at 03:04, Paul Sandoz <[hidden email]> wrote:

> I agree with Alan here, we should not be pushing a semantic understanding
> of inner classes into JarFile.
>
> I do sympathise with the case of annotation class scanning, which has
> always tunnelled through the class loader view to directly get at class
> file bytes possibly dealing with various URI schemes, since that is
> currently the only effective way of accessing the required information in
> an efficient manner.
>
> As Alan mentioned we should add a traversable versioned view of a JarFile,
> returning a Stream, from which it should be possible to filter according to
> certain semantics.
>
> Paul.
>
>
> > On 17 Sep 2017, at 12:27, Alan Bateman <[hidden email]> wrote:
> >
> > On 15/09/2017 22:58, Greg Wilkins wrote:
> >> :
> >>
> >>  * I think the stream needs to handle inner classes and only include
> >>    them if their matching outerclass is available at the same
> >>    version.  So for example a base Foo$Bar.class will only be
> >>    included if the stream includes a base Foo.class, and it will not
> >>    be included if the Foo.class is version 9 or above.  Likewise a
> >>    version 9 Foo$Bar.class will only be included in the stream if the
> >>    stream also includes a version 9 Foo.class, and will not be
> >>    included if the stream has a version 10 or above Foo.class
> >>
> >> If you think this last point is possible, then I'll move the discussion
> back the EE expert groups to try to get an agreement on the exact stream
> code that will be used in the mid term until it is available in the JRE
> lib, at which time the specs should be amended to say they will defer the
> decision of which classes to scan the JRE lib so they will be future proof
> for any changes in java 10, 11 etc.
> >>
> > I don't think this should be pushed down to the JarFile API. The JarFile
> API provides the base API for accessing JAR files and should not be
> concerned with the semantics or relationship between entries. I agree that
> annotation scanning tools and libraries need to do additional work to deal
> with orphaned or menacing inner classes in a MR JAR but it's not too
> different to arranging a class path with a JAR file containing the "classes
> for JDK 9" ahead of a JAR file containing the version of the library that
> runs on JDK 8. I do think that further checks could be done by the `jar`
> tool to identify issues at packaging time.
> >
> > -Alan
>
>


--
Greg Wilkins <[hidden email]> CTO http://webtide.com

RE: Scanning multi version jars?

Stephen Felts
A versioned file name, JarEntry.getName(), starts with "META-INF/versions/".
The version is the following string up to the next "/".
The version can be parsed with Runtime.Version.parse().
If it is not a versioned class file name, then use JarFile.baseVersion().
That should be sufficient to get the version for any JarEntry.
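The parsing steps above can be sketched as follows. This is only an illustration: the class and method names (EntryVersions, versionOf) are made up for the example, and it assumes the name passed in is the entry's name as stored in the archive. It needs Java 9 or later for Runtime.Version and JarFile.baseVersion().

```java
import java.util.jar.JarFile;

public final class EntryVersions {
    private static final String PREFIX = "META-INF/versions/";

    private EntryVersions() {}

    /** Returns the version encoded in a stored entry name, or the base version. */
    public static Runtime.Version versionOf(String storedName) {
        if (storedName.startsWith(PREFIX)) {
            int slash = storedName.indexOf('/', PREFIX.length());
            if (slash > PREFIX.length()) {
                try {
                    // The version is the path segment right after the prefix.
                    return Runtime.Version.parse(storedName.substring(PREFIX.length(), slash));
                } catch (IllegalArgumentException e) {
                    // Not a version directory: fall through and treat it as a base entry.
                }
            }
        }
        return JarFile.baseVersion();
    }
}
```

For example, versionOf("META-INF/versions/9/org/example/Foo.class") yields version 9, while versionOf("org/example/Foo.class") yields the base version.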

If it needs to run on pre-JDK9, this needs a lot of reflection.

IMO, having a method that behaves as described below is likely to be needed for many use cases, and it would be good if someone wrote it and put it in a well-known, public jar file.



-----Original Message-----
From: Greg Wilkins [mailto:[hidden email]]
Sent: Monday, September 18, 2017 8:19 PM
To: Paul Sandoz <[hidden email]>
Cc: jigsaw-dev <[hidden email]>; [hidden email]
Subject: Re: Scanning multi version jars?

Paul,

yeh... I guess I concede it's not JarFiles job... as much as that would make things easier for containers to reach agreement:(

However, can we at least look at having a new default method on JarEntry to query the version. Without that, containers don't have the information available to perform the semantic filtering required and thus will not be able to use the stream API and will have to work from an unversioned stream.

regards

On 19 September 2017 at 03:04, Paul Sandoz <[hidden email]> wrote:

> I agree with Alan here, we should not be pushing a semantic
> understanding of inner classes into JarFile.
>
> I do sympathise with the case of annotation class scanning, which has
> always tunnelled through the class loader view to directly get at
> class file bytes possibly dealing with various URI schemes, since that
> is currently the only effective way of accessing the required
> information in an efficient manner.
>
> As Alan mentioned we should add a traversable versioned view of a
> JarFile, returning a Stream, from which it should be possible to
> filter according to certain semantics.
>
> Paul.
>
>
> > On 17 Sep 2017, at 12:27, Alan Bateman <[hidden email]> wrote:
> >
> > On 15/09/2017 22:58, Greg Wilkins wrote:
> >> :
> >>
> >>  * I think the stream needs to handle inner classes and only include
> >>    them if their matching outerclass is available at the same
> >>    version.  So for example a base Foo$Bar.class will only be
> >>    included if the stream includes a base Foo.class, and it will not
> >>    be included if the Foo.class is version 9 or above.  Likewise a
> >>    version 9 Foo$Bar.class will only be included in the stream if the
> >>    stream also includes a version 9 Foo.class, and will not be
> >>    included if the stream has a version 10 or above Foo.class
> >>
> >> If you think this last point is possible, then I'll move the
> >> discussion
> back the EE expert groups to try to get an agreement on the exact
> stream code that will be used in the mid term until it is available in
> the JRE lib, at which time the specs should be amended to say they
> will defer the decision of which classes to scan the JRE lib so they
> will be future proof for any changes in java 10, 11 etc.
> >>
> > I don't think this should be pushed down to the JarFile API. The
> > JarFile
> API provides the base API for accessing JAR files and should not be
> concerned with the semantics or relationship between entries. I agree
> that annotation scanning tools and libraries need to do additional
> work to deal with orphaned or menacing inner classes in a MR JAR but
> it's not too different to arranging a class path with a JAR file
> containing the "classes for JDK 9" ahead of a JAR file containing the
> version of the library that runs on JDK 8. I do think that further
> checks could be done by the `jar` tool to identify issues at packaging time.
> >
> > -Alan
>
>


--
Greg Wilkins <[hidden email]> CTO http://webtide.com

Re: Scanning multi version jars?

Greg Wilkins
Stephen,

It is not the case that getName() always returns the path starting with
"META-INF/versions/". Specifically, if the entry is obtained from the
getJarEntry() API (and not from the enumerator), then the name is that of
the unversioned file, but the metadata and contents obtained using the
JarEntry are for the versioned entry.

For example the following code:


JarFile jarFile = new JarFile(new File("/tmp/example.jar"),
                              false,
                              JarFile.OPEN_READ,
                              Runtime.version());
JarEntry entry = jarFile.getJarEntry("org/example/OnlyIn9.class");
System.err.printf("%s -> %s%n", entry.getName(), IO.toString(jarFile.getInputStream(entry)));

when run against a jar where the class files contain just the text of their
full path produces the following output:

org/example/OnlyIn9.class -> META-INF/versions/9/org/example/OnlyIn9.class


There is nothing in the public API of the JarEntry so obtained that
indicates that it is the versioned entry, nor can I distinguish it from an
entry obtained by iteration that may report the same name (if the entry was
also in the base), although at least equals() does return false.

Moreover, the proposed stream API, as represented by the current
implementation of jdk.internal.util.jar.VersionedStream, applies some
filtering based on the versioning and then converts its enumerated
JarEntry instances to opaquely versioned JarEntry instances by calling
map(jf::getJarEntry), which hides the version information and makes any
additional filtering based on version impossible for users of that
stream.
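One way to keep the version information that such a mapping discards is to resolve versions from the stored entry names directly. The following is a minimal sketch under that assumption, not the JDK's VersionedStream implementation; the names VersionResolver and resolve are hypothetical, and 0 is used here to stand for the base (unversioned) entry.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public final class VersionResolver {
    private static final String PREFIX = "META-INF/versions/";

    private VersionResolver() {}

    /**
     * Maps each logical name to the stored name of the entry a runtime of the
     * given feature version would use: the highest version not above
     * runtimeVersion, falling back to the base entry.
     */
    public static Map<String, String> resolve(List<String> storedNames, int runtimeVersion) {
        Map<String, String> storedByLogical = new TreeMap<>();
        Map<String, Integer> versionByLogical = new HashMap<>();
        for (String stored : storedNames) {
            String logical = stored;
            int version = 0; // 0 stands for the base entry in this sketch
            if (stored.startsWith(PREFIX)) {
                int slash = stored.indexOf('/', PREFIX.length());
                if (slash < 0) continue; // the versions directory itself, or malformed
                try {
                    version = Integer.parseInt(stored.substring(PREFIX.length(), slash));
                } catch (NumberFormatException e) {
                    continue; // non-numeric directory: ignored in this sketch
                }
                logical = stored.substring(slash + 1);
                // Multi-release versions start at 9; skip anything this runtime cannot use.
                if (logical.isEmpty() || version < 9 || version > runtimeVersion) continue;
            }
            if (version >= versionByLogical.getOrDefault(logical, -1)) {
                versionByLogical.put(logical, version);
                storedByLogical.put(logical, stored);
            }
        }
        return storedByLogical;
    }
}
```

Because the map value is the stored name, a caller can still tell whether org/example/Foo.class came from the base or from META-INF/versions/9/ and filter on that.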

regards


On 19 September 2017 at 11:05, Stephen Felts <[hidden email]>
wrote:

> A versioned file name, JarEntry.getName(), starts with
> "META-INF/versions/".
> The version is the following string up to the next "/".
> The version can be parsed with Runtime.Version.parse().
> If not a versioned class file name, then use Jarfile.baseVersion().
> That should be sufficient to get the version for any JarEntry.
>
> If it needs to run on pre-JDK9, this needs a lot of reflection.
>
> IMO Having a method that behaves as described below is likely to be needed
> for many use cases and it would be good if someone wrote it and put it in a
> well-known, public jar file.
>
>
>
> -----Original Message-----
> From: Greg Wilkins [mailto:[hidden email]]
> Sent: Monday, September 18, 2017 8:19 PM
> To: Paul Sandoz <[hidden email]>
> Cc: jigsaw-dev <[hidden email]>;
> [hidden email]
> Subject: Re: Scanning multi version jars?
>
> Paul,
>
> yeh... I guess I concede it's not JarFiles job... as much as that would
> make things easier for containers to reach agreement:(
>
> However, can we at least look at having a new default method on JarEntry
> to query the version. Without that, containers don't have the information
> available to perform the semantic filtering required and thus will not be
> able to use the stream API and will have to work from an unversioned stream.
>
> regards
>
> On 19 September 2017 at 03:04, Paul Sandoz <[hidden email]> wrote:
>
> > I agree with Alan here, we should not be pushing a semantic
> > understanding of inner classes into JarFile.
> >
> > I do sympathise with the case of annotation class scanning, which has
> > always tunnelled through the class loader view to directly get at
> > class file bytes possibly dealing with various URI schemes, since that
> > is currently the only effective way of accessing the required
> > information in an efficient manner.
> >
> > As Alan mentioned we should add a traversable versioned view of a
> > JarFile, returning a Stream, from which it should be possible to
> > filter according to certain semantics.
> >
> > Paul.
> >
> >
> > > On 17 Sep 2017, at 12:27, Alan Bateman <[hidden email]>
> wrote:
> > >
> > > On 15/09/2017 22:58, Greg Wilkins wrote:
> > >> :
> > >>
> > >>  * I think the stream needs to handle inner classes and only include
> > >>    them if their matching outerclass is available at the same
> > >>    version.  So for example a base Foo$Bar.class will only be
> > >>    included if the stream includes a base Foo.class, and it will not
> > >>    be included if the Foo.class is version 9 or above.  Likewise a
> > >>    version 9 Foo$Bar.class will only be included in the stream if the
> > >>    stream also includes a version 9 Foo.class, and will not be
> > >>    included if the stream has a version 10 or above Foo.class
> > >>
> > >> If you think this last point is possible, then I'll move the
> > >> discussion
> > back the EE expert groups to try to get an agreement on the exact
> > stream code that will be used in the mid term until it is available in
> > the JRE lib, at which time the specs should be amended to say they
> > will defer the decision of which classes to scan the JRE lib so they
> > will be future proof for any changes in java 10, 11 etc.
> > >>
> > > I don't think this should be pushed down to the JarFile API. The
> > > JarFile
> > API provides the base API for accessing JAR files and should not be
> > concerned with the semantics or relationship between entries. I agree
> > that annotation scanning tools and libraries need to do additional
> > work to deal with orphaned or menacing inner classes in a MR JAR but
> > it's not too different to arranging a class path with a JAR file
> > containing the "classes for JDK 9" ahead of a JAR file containing the
> > version of the library that runs on JDK 8. I do think that further
> > checks could be done by the `jar` tool to identify issues at packaging
> time.
> > >
> > > -Alan
> >
> >
>
>
> --
> Greg Wilkins <[hidden email]> CTO http://webtide.com
>



--
Greg Wilkins <[hidden email]> CTO http://webtide.com

RE: Scanning multi version jars?

Stephen Felts
Thanks for the clarification – I overstated the “any JarEntry”.

I didn’t look at VersionedStream so I now understand the limitations you mention.

 

In my case, it’s necessary to look at all files in the jar file to eliminate unneeded ordinary/inner classes, so JarInputStream.getNextJarEntry() can be used.

By using the versioned JarFile constructor, getting the JarEntry returns the right one for processing.  If I needed further filtering on the file names, I’d need to return the real file names.
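As a rough illustration of the raw enumeration described above, the sketch below walks a jar with JarInputStream.getNextJarEntry() and collects the entry names as stored, with no multi-release processing. The class RawJarListing and its sampleJar helper are hypothetical; the helper only builds a small in-memory jar so the example runs without a file on disk.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

public final class RawJarListing {
    private RawJarListing() {}

    /** Lists entry names exactly as stored in the jar, in archive order. */
    public static List<String> storedNames(InputStream jarBytes) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarInputStream in = new JarInputStream(jarBytes)) {
            for (JarEntry e = in.getNextJarEntry(); e != null; e = in.getNextJarEntry()) {
                names.add(e.getName());
            }
        }
        return names;
    }

    /** Demo helper: builds an in-memory jar whose entries contain their own path. */
    public static byte[] sampleJar(String... names) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream out = new JarOutputStream(bytes)) {
            for (String name : names) {
                out.putNextEntry(new JarEntry(name));
                out.write(name.getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
        }
        return bytes.toByteArray();
    }
}
```

Entries under META-INF/versions/ come back with their full stored path, so any version-based elimination has to be done by the caller.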

 

Maybe the use case isn’t so universal or well defined.

From: Greg Wilkins [mailto:[hidden email]]
Sent: Monday, September 18, 2017 9:33 PM
To: Stephen Felts <[hidden email]>
Cc: Paul Sandoz <[hidden email]>; jigsaw-dev <[hidden email]>; [hidden email]
Subject: Re: Scanning multi version jars?

 

Stephen,

 

It is not the case that the getName() always returns the path starting with "META-INF/versions/". Specifically, if the entry is obtained from getJarEntry() API (and not from the enumerator), then the name is that of the unversioned file, but the metadata and contents obtained using the jarEntry are for the versioned entry.

 

For example the following code:

 

JarFile jarFile = new JarFile(new File("/tmp/example.jar"),
                              false,
                              JarFile.OPEN_READ,
                              Runtime.version());
JarEntry entry = jarFile.getJarEntry("org/example/OnlyIn9.class");
System.err.printf("%s -> %s%n",entry.getName(),IO.toString(jarFile.getInputStream(entry)));

when run against a jar where the class files contain just the text of their full path produces the following output:

 

org/example/OnlyIn9.class -> META-INF/versions/9/org/example/OnlyIn9.class

 

There is nothing in the public API of the JarEntry so obtained that indicates that it the versioned entry, nor can I distinguish it from an entry obtained by iteration that may report the same name (if the entry was also in the base), although at least equals does return false.

 

Moreover, the proposed stream API as represented by the current implementation of jdk.internal.util.jar.VersionedStream, applies some filtering based on the versioning and then converts it's enumerated JarEntry instances to opaquely versioned JarEntry instances by calling map(jf::getJarEntry),which thus hides the version information and makes any additional filtering based on version impossible by any users of that stream.

 

regards

 

 

On 19 September 2017 at 11:05, Stephen Felts <[hidden email]> wrote:

A versioned file name, JarEntry.getName(), starts with "META-INF/versions/".
The version is the following string up to the next "/".
The version can be parsed with Runtime.Version.parse().
If not a versioned class file name, then use Jarfile.baseVersion().
That should be sufficient to get the version for any JarEntry.

If it needs to run on pre-JDK9, this needs a lot of reflection.

IMO Having a method that behaves as described below is likely to be needed for many use cases and it would be good if someone wrote it and put it in a well-known, public jar file.



-----Original Message-----
From: Greg Wilkins [mailto:[hidden email]]
Sent: Monday, September 18, 2017 8:19 PM
To: Paul Sandoz <[hidden email]>
Cc: jigsaw-dev <[hidden email]>; [hidden email]
Subject: Re: Scanning multi version jars?

Paul,

yeh... I guess I concede it's not JarFiles job... as much as that would make things easier for containers to reach agreement:(

However, can we at least look at having a new default method on JarEntry to query the version. Without that, containers don't have the information available to perform the semantic filtering required and thus will not be able to use the stream API and will have to work from an unversioned stream.

regards

On 19 September 2017 at 03:04, Paul Sandoz <[hidden email]> wrote:

> I agree with Alan here, we should not be pushing a semantic
> understanding of inner classes into JarFile.
>
> I do sympathise with the case of annotation class scanning, which has
> always tunnelled through the class loader view to directly get at
> class file bytes possibly dealing with various URI schemes, since that
> is currently the only effective way of accessing the required
> information in an efficient manner.
>
> As Alan mentioned we should add a traversable versioned view of a
> JarFile, returning a Stream, from which it should be possible to
> filter according to certain semantics.
>
> Paul.
>
>
> > On 17 Sep 2017, at 12:27, Alan Bateman <[hidden email]> wrote:
> >
> > On 15/09/2017 22:58, Greg Wilkins wrote:
> >> :
> >>
> >>  * I think the stream needs to handle inner classes and only include
> >>    them if their matching outerclass is available at the same
> >>    version.  So for example a base Foo$Bar.class will only be
> >>    included if the stream includes a base Foo.class, and it will not
> >>    be included if the Foo.class is version 9 or above.  Likewise a
> >>    version 9 Foo$Bar.class will only be included in the stream if the
> >>    stream also includes a version 9 Foo.class, and will not be
> >>    included if the stream has a version 10 or above Foo.class
> >>
> >> If you think this last point is possible, then I'll move the
> >> discussion
> back the EE expert groups to try to get an agreement on the exact
> stream code that will be used in the mid term until it is available in
> the JRE lib, at which time the specs should be amended to say they
> will defer the decision of which classes to scan the JRE lib so they
> will be future proof for any changes in java 10, 11 etc.
> >>
> > I don't think this should be pushed down to the JarFile API. The
> > JarFile
> API provides the base API for accessing JAR files and should not be
> concerned with the semantics or relationship between entries. I agree
> that annotation scanning tools and libraries need to do additional
> work to deal with orphaned or menacing inner classes in a MR JAR but
> it's not too different to arranging a class path with a JAR file
> containing the "classes for JDK 9" ahead of a JAR file containing the
> version of the library that runs on JDK 8. I do think that further
> checks could be done by the `jar` tool to identify issues at packaging time.
> >
> > -Alan
>
>


--
Greg Wilkins <[hidden email]> CTO http://webtide.com

Re: Scanning multi version jars?

Greg Wilkins
Stephen,

I think the use-case can be pretty well defined.

There should be an enumeration/iterator/stream available that provides the
contents of a jar file as it would be seen/interpreted by the JVM's
classloader. So if the classloader is doing any processing to handle
versioned classes/resources, then we need an iterator that implements the
exact same logic.

Which raises an interesting point: with the multi-versioned jar I have
used as an example, which contains:

   - org/example/Foo.class
   - org/example/Foo$Bar.class
   - META-INF/versions/9/org/example/Foo.class

What does the classloader do when asked to load "org.example.Foo$Bar"?
If it loads it OK, then the JarFile enumerator/iterator/stream should also
return it. If it throws a ClassNotFoundException, then the
JarFile enumerator/iterator/stream should skip it.

Currently the classloader will happily return a resource for a base inner
class even if its outer class does not refer to it, which suggests that
the iteration should also not filter out the inappropriate inner classes.
However, I think it could be argued that the loader should not load it.

Either way, there should be an iteration available that is entirely
consistent with what the classloader does.
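For comparison, the stricter alternative (skip an inner class whose outer class resolves to a different version) could be sketched as below. This is an illustration only, with hypothetical names (InnerClassFilter, scannable). It assumes the caller already knows, for each logical name, which version was selected (0 standing for the base entry), and it treats every '$' in a class file name as a nesting separator, which is a naming convention rather than a guarantee.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public final class InnerClassFilter {
    private InnerClassFilter() {}

    /** Returns the logical names that survive the "same version as outer class" rule. */
    public static List<String> scannable(Map<String, Integer> versionByName) {
        List<String> result = new ArrayList<>();
        // TreeMap only gives a stable, sorted iteration order.
        for (Map.Entry<String, Integer> e : new TreeMap<>(versionByName).entrySet()) {
            if (consistentWithOuter(e.getKey(), e.getValue(), versionByName)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    private static boolean consistentWithOuter(String name, int version,
                                               Map<String, Integer> versionByName) {
        String current = name;
        while (current.endsWith(".class")) {
            int dollar = current.lastIndexOf('$');
            int slash = current.lastIndexOf('/');
            // No '$' inside the simple name (or only a leading one): top-level class.
            if (dollar <= slash + 1) return true;
            String outer = current.substring(0, dollar) + ".class";
            Integer outerVersion = versionByName.get(outer);
            // Outer class missing or selected from another version: drop the inner class.
            if (outerVersion == null || outerVersion != version) return false;
            current = outer; // keep walking up for Foo$Bar$Bob style nesting
        }
        return true; // non-class resources are never filtered here
    }
}
```

With Foo.class selected at version 9 and Foo$Bar.class only present in the base, this returns just org/example/Foo.class, which is exactly where it diverges from what the classloader currently does.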

regards




On 19 September 2017 at 13:41, Stephen Felts <[hidden email]>
wrote:

> Thanks for the clarification – I overstated the “any JarEntry”.
>
> I didn’t look at VersionedStream so I now understand the limitations you
> mention.
>
>
>
> In my case, it’s necessary to look at all files in the jar file to do the
> elimination of unneeded ordinary/inner classes so JarInputStream getNextJarEntry()
> can be used.
>
> By using the versioned JarFile constructor, getting the JarEntry returns
> the right one for processing.  If I needed further filtering on the file
> names, I’d need to return the real file names.
>
>
>
> Maybe the use case isn’t so universal or well defined.
>
>
>
>
>
>
>
>
> *From:* Greg Wilkins [mailto:[hidden email]]
> *Sent:* Monday, September 18, 2017 9:33 PM
> *To:* Stephen Felts <[hidden email]>
> *Cc:* Paul Sandoz <[hidden email]>; jigsaw-dev <
> [hidden email]>; [hidden email]
>
> *Subject:* Re: Scanning multi version jars?
>
>
>
> Stephen,
>
>
>
> It is not the case that the getName() always returns the path starting
> with "META-INF/versions/". Specifically, if the entry is obtained from
> getJarEntry() API (and not from the enumerator), then the name is that of
> the unversioned file, but the metadata and contents obtained using the
> jarEntry are for the versioned entry.
>
>
>
> For example the following code:
>
>
>
> JarFile jarFile = new JarFile(new File("/tmp/example.jar"),
>                               false,
>                               JarFile.OPEN_READ,
>                               Runtime.version());
> JarEntry entry = jarFile.getJarEntry("org/example/OnlyIn9.class");
> System.err.printf("%s -> %s%n",entry.getName(),IO.toString(jarFile.
> getInputStream(entry)));
>
> when run against a jar where the class files contain just the text of
> their full path produces the following output:
>
>
>
> org/example/OnlyIn9.class -> META-INF/versions/9/org/example/OnlyIn9.class
>
>
>
> There is nothing in the public API of the JarEntry so obtained that
> indicates that it the versioned entry, nor can I distinguish it from an
> entry obtained by iteration that may report the same name (if the entry was
> also in the base), although at least equals does return false.
>
>
>
> Moreover, the proposed stream API as represented by the current
> implementation of jdk.internal.util.jar.VersionedStream, applies some
> filtering based on the versioning and then converts it's enumerated
> JarEntry instances to opaquely versioned JarEntry instances by calling
> map(jf::getJarEntry),which thus hides the version information and makes
> any additional filtering based on version impossible by any users of that
> stream.
>
>
>
> regards
>
>
>
>
>
> On 19 September 2017 at 11:05, Stephen Felts <[hidden email]>
> wrote:
>
> A versioned file name, JarEntry.getName(), starts with
> "META-INF/versions/".
> The version is the following string up to the next "/".
> The version can be parsed with Runtime.Version.parse().
> If not a versioned class file name, then use Jarfile.baseVersion().
> That should be sufficient to get the version for any JarEntry.
>
> If it needs to run on pre-JDK9, this needs a lot of reflection.
>
> IMO Having a method that behaves as described below is likely to be needed
> for many use cases and it would be good if someone wrote it and put it in a
> well-known, public jar file.
>
>
>
> -----Original Message-----
> From: Greg Wilkins [mailto:[hidden email]]
> Sent: Monday, September 18, 2017 8:19 PM
> To: Paul Sandoz <[hidden email]>
> Cc: jigsaw-dev <[hidden email]>;
> [hidden email]
> Subject: Re: Scanning multi version jars?
>
> Paul,
>
> yeh... I guess I concede it's not JarFiles job... as much as that would
> make things easier for containers to reach agreement:(
>
> However, can we at least look at having a new default method on JarEntry
> to query the version. Without that, containers don't have the information
> available to perform the semantic filtering required and thus will not be
> able to use the stream API and will have to work from an unversioned stream.
>
> regards
>
> On 19 September 2017 at 03:04, Paul Sandoz <[hidden email]> wrote:
>
> > I agree with Alan here, we should not be pushing a semantic
> > understanding of inner classes into JarFile.
> >
> > I do sympathise with the case of annotation class scanning, which has
> > always tunnelled through the class loader view to directly get at
> > class file bytes possibly dealing with various URI schemes, since that
> > is currently the only effective way of accessing the required
> > information in an efficient manner.
> >
> > As Alan mentioned we should add a traversable versioned view of a
> > JarFile, returning a Stream, from which it should be possible to
> > filter according to certain semantics.
> >
> > Paul.
> >
> >
> > > On 17 Sep 2017, at 12:27, Alan Bateman <[hidden email]>
> wrote:
> > >
> > > On 15/09/2017 22:58, Greg Wilkins wrote:
> > >> :
> > >>
> > >>  * I think the stream needs to handle inner classes and only include
> > >>    them if their matching outerclass is available at the same
> > >>    version.  So for example a base Foo$Bar.class will only be
> > >>    included if the stream includes a base Foo.class, and it will not
> > >>    be included if the Foo.class is version 9 or above.  Likewise a
> > >>    version 9 Foo$Bar.class will only be included in the stream if the
> > >>    stream also includes a version 9 Foo.class, and will not be
> > >>    included if the stream has a version 10 or above Foo.class
> > >>
> > >> If you think this last point is possible, then I'll move the
> > >> discussion
> > back the EE expert groups to try to get an agreement on the exact
> > stream code that will be used in the mid term until it is available in
> > the JRE lib, at which time the specs should be amended to say they
> > will defer the decision of which classes to scan the JRE lib so they
> > will be future proof for any changes in java 10, 11 etc.
> > >>
> > > I don't think this should be pushed down to the JarFile API. The
> > > JarFile
> > API provides the base API for accessing JAR files and should not be
> > concerned with the semantics or relationship between entries. I agree
> > that annotation scanning tools and libraries need to do additional
> > work to deal with orphaned or menacing inner classes in a MR JAR but
> > it's not too different to arranging a class path with a JAR file
> > containing the "classes for JDK 9" ahead of a JAR file containing the
> > version of the library that runs on JDK 8. I do think that further
> > checks could be done by the `jar` tool to identify issues at packaging
> time.
> > >
> > > -Alan
> >
> >
>
>
> --
> Greg Wilkins <[hidden email]> CTO http://webtide.com
>
>
>
>
>
> --
>
> Greg Wilkins <[hidden email]> CTO http://webtide.com
>



--
Greg Wilkins <[hidden email]> CTO http://webtide.com

Re: Scanning multi version jars?

Remi Forax
Hi Greg,
the notion of inner classes does not exist at runtime (unless you explicitly ask via reflection).

The compiler desugars inner classes into several classes, so the compiler needs attributes (InnerClasses and EnclosingMethod) in the classfile to be able to reconstruct the Java source file view of the world when it sees already compiled classes.

But the VM/runtime doesn't need those attributes; the VM doesn't care about inner classes. There is a comment at the end of section 4.7.6 of the JVMS that explains that the VM does not check the InnerClasses attribute at runtime: "Oracle's Java Virtual Machine implementation does not check the consistency of an InnerClasses attribute against a class file representing a class or interface referenced by the attribute."

So to answer your question, the classloader does not care about inner classes.
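A small demonstration of this point, with made-up names (NestedDemo, Inner): the nested class is compiled to its own class file with a '$' binary name, it can be loaded through that name like any other class, and the link back to the enclosing class only shows up when reflection is asked for it.

```java
public final class NestedDemo {
    /** Compiled to a separate class file whose binary name ends in NestedDemo$Inner. */
    static class Inner {}

    private NestedDemo() {}

    /** The binary name the class loader is given for the nested class. */
    public static String innerBinaryName() {
        return Inner.class.getName();
    }

    /** Loads the nested class by binary name, as for any top-level class. */
    public static Class<?> loadByBinaryName() {
        try {
            return Class.forName(innerBinaryName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** The enclosing class is recovered only through this reflective query. */
    public static Class<?> enclosingViaReflection() {
        return loadByBinaryName().getEnclosingClass();
    }
}
```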

Note that in the near future, with the introduction of nestmates [1], it will be another story, but we have to wait at least until 18.9 for that.

Rémi
[1] http://openjdk.java.net/jeps/181

----- Mail original -----
> De: "Greg Wilkins" <[hidden email]>
> À: "Stephen Felts" <[hidden email]>
> Cc: "jigsaw-dev" <[hidden email]>, "core-libs-dev" <[hidden email]>
> Envoyé: Mardi 19 Septembre 2017 06:37:37
> Objet: Re: Scanning multi version jars?

> Stephen,
>
> I think the use-case can be pretty well defined.
>
> There should be an enumeration/iterator/stream available that provides the
> contents of a jar file as it would be seen/interpreted by the JVMs
> classloader.    So if the classloader is doing any processing to handle
> versioned classes/resources, then we need an iterator that implements the
> exact same logic.
>
> Which raises an interesting point....   with the multi versioned jar I have
> used as an example, which contains:
>
>   - org/example/Foo.class
>   - org/example/Foo$Bar.class
>   - META-INF/versions/9/org/example/Foo.class
>
> What does the classloader do when asked to load "org.example.Foo$Bar" ?
> If it loads it OK, then the JarFile enumerator/iterator/stream should also
> return it.   If it throws a ClassNotFoundException, then the
> JarFile enumerator/iterator/stream should skip it.
>
> Currently the classloader will happily return a resource for a base inner
> class even if its outerclass does not refer to it, so that suggests that
> the iteration should also not process out the inappropriate inner classes.
> However I think it could be argued that the loader should not load it.
>
> Eitherway, there should be an iteration available that is entirely
> consistent with what the classloader does.
>
> regards
>
>
>
>
> On 19 September 2017 at 13:41, Stephen Felts <[hidden email]>
> wrote:
>
>> Thanks for the clarification – I overstated the “any JarEntry”.
>>
>> I didn’t look at VersionedStream so I now understand the limitations you
>> mention.
>>
>>
>>
>> In my case, it’s necessary to look at all files in the jar file to do the
>> elimination of unneeded ordinary/inner classes so JarInputStream
>> getNextJarEntry()
>> can be used.
>>
>> By using the versioned JarFile constructor, getting the JarEntry returns
>> the right one for processing.  If I needed further filtering on the file
>> names, I’d need to return the real file names.
>>
>>
>>
>> Maybe the use case isn’t so universal or well defined.
>>
>>
>>
>>
>>
>>
>>
>>
>> *From:* Greg Wilkins [mailto:[hidden email]]
>> *Sent:* Monday, September 18, 2017 9:33 PM
>> *To:* Stephen Felts <[hidden email]>
>> *Cc:* Paul Sandoz <[hidden email]>; jigsaw-dev <
>> [hidden email]>; [hidden email]
>>
>> *Subject:* Re: Scanning multi version jars?
>>
>>
>>
>> Stephen,
>>
>>
>>
>> It is not the case that the getName() always returns the path starting
>> with "META-INF/versions/". Specifically, if the entry is obtained from
>> getJarEntry() API (and not from the enumerator), then the name is that of
>> the unversioned file, but the metadata and contents obtained using the
>> jarEntry are for the versioned entry.
>>
>>
>>
>> For example the following code:
>>
>>
>>
>> JarFile jarFile = new JarFile(new File("/tmp/example.jar"),
>>                               false,
>>                               JarFile.OPEN_READ,
>>                               Runtime.version());
>> JarEntry entry = jarFile.getJarEntry("org/example/OnlyIn9.class");
>> System.err.printf("%s -> %s%n", entry.getName(),
>>                   IO.toString(jarFile.getInputStream(entry)));
>>
>> when run against a jar where the class files contain just the text of
>> their full path produces the following output:
>>
>>
>>
>> org/example/OnlyIn9.class -> META-INF/versions/9/org/example/OnlyIn9.class
>>
>>
>>
>> There is nothing in the public API of the JarEntry so obtained that
>> indicates that it is the versioned entry, nor can I distinguish it from an
>> entry obtained by iteration that may report the same name (if the entry was
>> also in the base), although at least equals does return false.
>>
>>
>>
>> Moreover, the proposed stream API as represented by the current
>> implementation of jdk.internal.util.jar.VersionedStream, applies some
>> filtering based on the versioning and then converts its enumerated
>> JarEntry instances to opaquely versioned JarEntry instances by calling
>> map(jf::getJarEntry), which thus hides the version information and makes
>> any additional filtering based on version impossible by any users of that
>> stream.
>>
>>
>>
>> regards
>>
>>
>>
>>
>>
>> On 19 September 2017 at 11:05, Stephen Felts <[hidden email]>
>> wrote:
>>
>> A versioned file name, JarEntry.getName(), starts with
>> "META-INF/versions/".
>> The version is the following string up to the next "/".
>> The version can be parsed with Runtime.Version.parse().
>> If not a versioned class file name, then use JarFile.baseVersion().
>> That should be sufficient to get the version for any JarEntry.
>>
>> If it needs to run on pre-JDK9, this needs a lot of reflection.
>>
>> IMO Having a method that behaves as described below is likely to be needed
>> for many use cases and it would be good if someone wrote it and put it in a
>> well-known, public jar file.
>>
>>
>>
>> -----Original Message-----
>> From: Greg Wilkins [mailto:[hidden email]]
>> Sent: Monday, September 18, 2017 8:19 PM
>> To: Paul Sandoz <[hidden email]>
>> Cc: jigsaw-dev <[hidden email]>;
>> [hidden email]
>> Subject: Re: Scanning multi version jars?
>>
>> Paul,
>>
>> yeah... I guess I concede it's not JarFile's job... as much as that would
>> make things easier for containers to reach agreement:(
>>
>> However, can we at least look at having a new default method on JarEntry
>> to query the version. Without that, containers don't have the information
>> available to perform the semantic filtering required and thus will not be
>> able to use the stream API and will have to work from an unversioned stream.
>>
>> regards
>>
>> On 19 September 2017 at 03:04, Paul Sandoz <[hidden email]> wrote:
>>
>> > I agree with Alan here, we should not be pushing a semantic
>> > understanding of inner classes into JarFile.
>> >
>> > I do sympathise with the case of annotation class scanning, which has
>> > always tunnelled through the class loader view to directly get at
>> > class file bytes possibly dealing with various URI schemes, since that
>> > is currently the only effective way of accessing the required
>> > information in an efficient manner.
>> >
>> > As Alan mentioned we should add a traversable versioned view of a
>> > JarFile, returning a Stream, from which it should be possible to
>> > filter according to certain semantics.
>> >
>> > Paul.
>> >
>> >
>> > > On 17 Sep 2017, at 12:27, Alan Bateman <[hidden email]> wrote:
>> > >
>> > > On 15/09/2017 22:58, Greg Wilkins wrote:
>> > >> :
>> > >>
>> > >>  * I think the stream needs to handle inner classes and only include
>> > >>    them if their matching outerclass is available at the same
>> > >>    version.  So for example a base Foo$Bar.class will only be
>> > >>    included if the stream includes a base Foo.class, and it will not
>> > >>    be included if the Foo.class is version 9 or above.  Likewise a
>> > >>    version 9 Foo$Bar.class will only be included in the stream if the
>> > >>    stream also includes a version 9 Foo.class, and will not be
>> > >>    included if the stream has a version 10 or above Foo.class
>> > >>
>> > >> If you think this last point is possible, then I'll move the
>> > >> discussion back to the EE expert groups to try to get an agreement
>> > >> on the exact stream code that will be used in the mid term until it
>> > >> is available in the JRE lib, at which time the specs should be
>> > >> amended to say they will defer the decision of which classes to scan
>> > >> to the JRE lib so they will be future proof for any changes in
>> > >> java 10, 11 etc.
>> > >>
>> > > I don't think this should be pushed down to the JarFile API. The
>> > > JarFile API provides the base API for accessing JAR files and should
>> > > not be concerned with the semantics or relationship between entries.
>> > > I agree that annotation scanning tools and libraries need to do
>> > > additional work to deal with orphaned or menacing inner classes in a
>> > > MR JAR but it's not too different to arranging a class path with a
>> > > JAR file containing the "classes for JDK 9" ahead of a JAR file
>> > > containing the version of the library that runs on JDK 8. I do think
>> > > that further checks could be done by the `jar` tool to identify
>> > > issues at packaging time.
>> > >
>> > > -Alan
>> >
>> >
>>
>>
>> --
>> Greg Wilkins <[hidden email]> CTO http://webtide.com
>>
>>
>>
>>
>>
>> --
>>
>> Greg Wilkins <[hidden email]> CTO http://webtide.com
>>
>
>
>
> --
> Greg Wilkins <[hidden email]> CTO http://webtide.com

Re: Scanning multi version jars?

Alan Bateman
In reply to this post by Greg Wilkins
On 19/09/2017 05:37, Greg Wilkins wrote:

> :
>
> Which raises an interesting point....   with the multi versioned jar I have
> used as an example, which contains:
>
>     - org/example/Foo.class
>     - org/example/Foo$Bar.class
>     - META-INF/versions/9/org/example/Foo.class
>
> What does the classloader do when asked to load "org.example.Foo$Bar" ?
>   If it loads it OK, then the JarFile enumerator/iterator/stream should also
> return it.   If it throws a ClassNotFoundException, then the
> JarFile enumerator/iterator/stream should skip it.
A class loader that loads from a JAR file will just map
"org.example.Foo$Bar" to entry "org/example/Foo$Bar.class" and attempt
to define the class bytes to the VM.  It shouldn't care if the entry comes
from the base or a versioned section. It also shouldn't care if the
class name looks like it might have been compiled from an inner class.
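
A minimal sketch of that mapping, for illustration only. The class and
method names (ClassEntryLookup, entryNameFor, findClassEntry) are
hypothetical and not taken from any JDK class loader; the lookup assumes
the JarFile was opened with the versioned constructor so that getJarEntry
resolves versioned entries for the running release.

```java
import java.io.File;
import java.io.IOException;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.zip.ZipFile;

final class ClassEntryLookup {

    /** Maps a binary class name to a jar entry name; '$' gets no special treatment. */
    static String entryNameFor(String binaryClassName) {
        return binaryClassName.replace('.', '/') + ".class";
    }

    /**
     * Looks the entry up by name only. Whether the result comes from the base
     * or from a versioned section is left entirely to the JarFile.
     * Returns null when no such entry exists.
     */
    static JarEntry findClassEntry(JarFile jar, String binaryClassName) {
        return jar.getJarEntry(entryNameFor(binaryClassName));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: ClassEntryLookup <path-to-jar>");
            return;
        }
        // Versioned view of the jar for the running Java release.
        try (JarFile jar = new JarFile(new File(args[0]), true,
                                       ZipFile.OPEN_READ, Runtime.version())) {
            JarEntry entry = findClassEntry(jar, "org.example.Foo$Bar");
            System.out.println(entry == null ? "not found" : "found " + entry.getName());
        }
    }
}
```

With the example jar from this thread, the lookup for
"org.example.Foo$Bar" would find the base entry, regardless of which
version of Foo.class is selected.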

The one case where a custom class loader does need to know more is when
it is loading resources (findResource/findResources implementation
usually). For that case the returned URL needs to locate the right
resource and so may encode a path to a resource in a versioned section.
You'll see URLClassLoader does the right thing, as do the built-in
class loaders for cases where you have modular MR JARs on the module
path. There were a few threads on core-libs-dev discussing whether to
add a getRealName method but in the end it was kicked down the road to
re-examine later.
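
When only the real entry path is known (for example from the enumerator,
or from the part of a jar: URL after "!/"), the version can be recovered
with the rule quoted earlier in this thread: strip the
"META-INF/versions/" prefix, parse the next path segment with
Runtime.Version.parse(), and fall back to JarFile.baseVersion()
otherwise. A rough sketch; EntryVersions, versionOf and unversionedName
are hypothetical helper names, not JDK API.

```java
import java.util.jar.JarFile;

final class EntryVersions {

    private static final String VERSIONS_PREFIX = "META-INF/versions/";

    /** Index of the '/' that ends the version segment, or -1 for a base entry. */
    private static int versionEnd(String realEntryName) {
        if (!realEntryName.startsWith(VERSIONS_PREFIX)) {
            return -1;
        }
        int end = realEntryName.indexOf('/', VERSIONS_PREFIX.length());
        return end > VERSIONS_PREFIX.length() ? end : -1;
    }

    /**
     * Version of the section the entry lives in. Base entries report
     * JarFile.baseVersion(). A non-numeric version directory makes
     * Runtime.Version.parse throw IllegalArgumentException; this sketch
     * does not try to handle that case.
     */
    static Runtime.Version versionOf(String realEntryName) {
        int end = versionEnd(realEntryName);
        if (end < 0) {
            return JarFile.baseVersion();
        }
        return Runtime.Version.parse(
                realEntryName.substring(VERSIONS_PREFIX.length(), end));
    }

    /** Entry name with any "META-INF/versions/N/" prefix removed. */
    static String unversionedName(String realEntryName) {
        int end = versionEnd(realEntryName);
        return end < 0 ? realEntryName : realEntryName.substring(end + 1);
    }
}
```

This only works on names that still carry the versioned prefix; as noted
earlier in the thread, an entry obtained from getJarEntry() on a
versioned JarFile reports the unversioned name.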

-Alan