[TUTORIAL|EXTREME] Beyond Reflection - ASM - Replacing Loaded Classes

Discussion in 'Resources' started by Icyene, Sep 5, 2012.

Thread Status:
Not open for further replies.
  1. Offline


    When I wrote the first part of this tutorial I wasn't entirely happy with it. AspectJ is awesome, but has some pretty big drawbacks:

    • With the current AspectJ you cannot weave already loaded classes, making it useless for plugins wishing to "hack" already loaded classes, e.g. Bukkit.
    • The runtime is pretty big: 169KB.
    So I thought to myself, "There has to be a better way to do this". Now I am in the position to say, "Yes, there is. And here is how". This tutorial aims at creating a profiler nearly identical to the one shown in the previous tutorial, but one that can bind to a class at runtime and uses a much smaller library (ASM).

    This will be a long tutorial, so if you do not have at least one hour to spare to learn this amazing thing, it's not worth beginning.

    As said in the title, this is an extremely hard thing to do (efficiently). Our agenda (should you choose to start):

    • Explain how reinstrumenting works
    • Discuss the Java Attach API; its possibilities and limitations
    • Explain the basis of an ASM-based profiler
    • Make a late-bind attacher
    • Make a late-bind agent
    So with that in mind, let's begin!

    Using ASM is much harder than using AspectJ. AspectJ also manipulates bytecode under the hood, but provides much cleaner interfaces. The real world of ASM is very grim, and you will be dealing with programs on a bytecode level for quite a bit. Luckily, there is an ASM plugin for Eclipse to help speed things up. Additionally, reference the ASM libraries. They are downloadable from here.

    But I've forgotten to say what ASM is. ASM is a bytecode engineering library. It allows programs to change the code of other programs, without decompiling.

    Reinstrumenting is the fancy way of saying "modifying". Java provides a small API for this, which gives basic tools to make instrumenters. Here is a short outline:

    • The com.sun.* classes. These are mainly for instrumentation. They come with the JDK, in the library "tools.jar". That said, tools.jar does not exist in a JRE, and is 14 MB in size. If you want to use this in a plugin, you will have to ship the necessary classes from tools.jar. In this tutorial we only use about 10, which is reasonable.
    • A reinstrumenter is called an agent.
    • The point of entry for an agent's class is agentmain(String, Instrumentation).
    • Agents have a transform() method that is called to instrument the class provided to it.
    Java Attach API
    Reinstrumenting is trivial when your agent is loaded before its target classes are. However, in the case of Bukkit, Bukkit loads before your plugin does, so you cannot instrument Bukkit that way. To address this issue, Java 1.6 introduced the Attach API. This is another part of the com.sun.* packages, and contains a basic injector. This injector is written in native code. It injects attach.(dll|so) into the target java.exe process. The library comes with the JDK ONLY, so you will have to ship it or come up with a clever solution to get it into the user's JRE. Luckily, the dll is small (15KB on Windows, 12KB on Solaris, etc).

    The Attach API looks for the dll in %JAVA_PATH%/jre/lib. If you copy the attach library there, it will load without problems. Note that if you are using Eclipse, all code in this tutorial will produce an UnsatisfiedLinkError. This is because Eclipse uses the JRE to run your apps. You will have to copy the attach library into that JRE, and reference tools.jar.

    ASM Profiler
    I won't go into too much detail, but ASM provides ways to inject code into methods being reinstrumented. Basically, you can add a new method call inside a method with ASM, using visitMethodInsn in the visitCode() function of an adapter (more on that later).

    Coding a Late Bind Attacher
    This is where the fun starts. Here we use the Java Attach API to load our agent into the target JVM. Note that plugins run inside Bukkit's JVM, so the PID of the plugin is the PID of Bukkit.

    First, we need to create a method to get the PID. This is simple enough, using Java's ManagementFactory.

    public static String getPidFromRuntimeMBean() {
        String jvm = ManagementFactory.getRuntimeMXBean().getName();
        String pid = jvm.substring(0, jvm.indexOf('@'));
        return pid;
    }
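
    The runtime name usually looks like pid@hostname, but that format is a JVM implementation detail rather than a guarantee. Here is a slightly more defensive sketch of the same idea. PidSketch, parsePid and currentPid are names made up for this example, not part of the tutorial's code.

```java
import java.lang.management.ManagementFactory;

final class PidSketch {

    // Extracts the part before '@' from a runtime name such as "12345@myhost".
    // Falls back to the whole string when there is no usable '@'.
    static String parsePid(String runtimeName) {
        int at = runtimeName.indexOf('@');
        return at > 0 ? runtimeName.substring(0, at) : runtimeName;
    }

    // PID of the JVM this code is running in (assuming the usual name format).
    static String currentPid() {
        return parsePid(ManagementFactory.getRuntimeMXBean().getName());
    }

    public static void main(String[] args) {
        System.out.println("Running in JVM with PID " + currentPid());
    }
}
```

    Keeping the string handling in its own method means it can be checked without caring what the real PID is.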

    Next, we actually have to create the loader. We need to create a jar file for the agent on the fly, and create a manifest file for it as well. For an agent to be allowed to reinstrument a class, its manifest must have the attribute "Agent-Class" pointing to the main agent class, as well as "Can-Retransform-Classes" and "Can-Redefine-Classes" set to "true".

    So without any further ado, let's begin. We first create a temporary agent jar that will be cleaned up on exit.

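    public static void attachAgentToJVM(Class<?>[] agentClasses, String JVMPid) {
        // The calls below throw checked exceptions, so wrap the finished body in a try/catch.
        final File jarFile = File.createTempFile("agent", ".jar");
        jarFile.deleteOnExit();
        // ... the manifest, jar entries and attach call from the next steps go here ...
    }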

    Now we create our manifest and write it to the jar:

    final Manifest manifest = new Manifest();
    final Attributes mainAttributes = manifest.getMainAttributes();
    mainAttributes.put(Attributes.Name.MANIFEST_VERSION, "1.0");
    mainAttributes.put(new Attributes.Name("Agent-Class"), Agent.class.getName());
    mainAttributes.put(new Attributes.Name("Can-Retransform-Classes"), "true");
    mainAttributes.put(new Attributes.Name("Can-Redefine-Classes"), "true");
    final JarOutputStream jos = new JarOutputStream(new FileOutputStream(jarFile), manifest);
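
    If you want to convince yourself that the manifest really ends up in the jar, you can write it to memory and read it back. This is only a sketch, assuming Java 8 or newer. ManifestSketch, agentManifest and roundTrip are made-up names, and "com.example.Agent" is a placeholder for your own agent class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

final class ManifestSketch {

    // Builds the agent manifest described above for a given agent class name.
    static Manifest agentManifest(String agentClassName) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version must be set, otherwise the other main attributes
        // are silently dropped when the manifest is written.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(new Attributes.Name("Agent-Class"), agentClassName);
        main.put(new Attributes.Name("Can-Retransform-Classes"), "true");
        main.put(new Attributes.Name("Can-Redefine-Classes"), "true");
        return manifest;
    }

    // Writes an otherwise empty jar to memory and reads its manifest back.
    static Manifest roundTrip(Manifest manifest) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            JarOutputStream jos = new JarOutputStream(bytes, manifest);
            jos.close(); // no entries needed, the manifest alone is enough here
            JarInputStream jis = new JarInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()));
            try {
                return jis.getManifest();
            } finally {
                jis.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Manifest readBack = roundTrip(agentManifest("com.example.Agent"));
        System.out.println(readBack.getMainAttributes().getValue("Agent-Class"));
    }
}
```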

    Next, we iterate through the Class<?>[] given to us and add all the classes there to the jar file. Note that some of the methods used here are still missing at the moment; we will get to those later:

    for (Class<?> clazz : agentClasses) {
        final JarEntry agent = new JarEntry(clazz.getName().replace('.', '/') + ".class");
        jos.putNextEntry(agent);
        jos.write(getBytesFromIS(clazz.getClassLoader()
                .getResourceAsStream(clazz.getName().replace('.', '/') + ".class")));
        jos.closeEntry();
    }

    What we are doing is getting the bytes from the class stream and writing them to the jar. Note that we first turn the fully qualified class name into a resource path by replacing the dots with slashes and appending ".class".

    The above code requires a couple of IO-related functions. If you have an Apache Commons dependency already, you can use its IOUtils class. Otherwise, some simple functions will suffice.

    /**
     * Gets bytes from InputStream
     *
     * @param stream
     *            The InputStream
     * @return
     *            Returns a byte[] representation of given stream
     */
    public static byte[] getBytesFromIS(InputStream stream) {
        // Collects everything read from the stream
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            int nRead;
            byte[] data = new byte[16384];
            while ((nRead = stream.read(data, 0, data.length)) != -1) {
                buffer.write(data, 0, nRead);
            }
            buffer.flush();
        } catch (Exception e) {
            System.err.println("Failed to convert IS to byte[]!");
            e.printStackTrace();
        }
        return buffer.toByteArray();
    }

    /**
     * Gets bytes from class
     *
     * @param clazz
     *            The class
     * @return
     *            Returns a byte[] representation of given class
     */
    public static byte[] getBytesFromClass(Class<?> clazz) {
        return getBytesFromIS(clazz.getClassLoader().getResourceAsStream(clazz.getName().replace('.', '/') + ".class"));
    }
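
    One caveat with the helpers above: classes loaded by the bootstrap loader (java.lang.String, for example) report a null ClassLoader, so clazz.getClassLoader().getResourceAsStream(...) would throw a NullPointerException for them. Below is a sketch of a variant that falls back to the system resource lookup, assuming Java 8 or newer. ClassBytesSketch, resourcePath and classBytes are names invented for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ClassBytesSketch {

    // Turns "java.util.Map$Entry" into "java/util/Map$Entry.class".
    static String resourcePath(Class<?> clazz) {
        return clazz.getName().replace('.', '/') + ".class";
    }

    // Reads the class file bytes, falling back to the system resource
    // lookup when the class has no ClassLoader of its own.
    static byte[] classBytes(Class<?> clazz) {
        ClassLoader loader = clazz.getClassLoader();
        String path = resourcePath(clazz);
        try (InputStream in = loader != null
                ? loader.getResourceAsStream(path)
                : ClassLoader.getSystemResourceAsStream(path)) {
            if (in == null) {
                throw new IllegalArgumentException("No class file found for " + clazz.getName());
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = classBytes(String.class);
        // Every class file starts with the magic number 0xCAFEBABE.
        System.out.printf("%s: %d bytes, magic %02X%02X%02X%02X%n",
                resourcePath(String.class), bytes.length,
                bytes[0], bytes[1], bytes[2], bytes[3]);
    }
}
```

    Unlike getBytesFromIS, this version fails loudly when the class file cannot be found instead of returning an empty array.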

    Now just close the jar with jos.close(), and we can finally actually attach the agent! I made it sound much more difficult than it actually is. You attach the agent in 3 simple lines:

    VirtualMachine vm = VirtualMachine.attach(JVMPid);
    vm.loadAgent(jarFile.getAbsolutePath());
    vm.detach();

    Great, now we have a functioning agent loader. Now all we need to write is the agent itself!

    The Agent
    All agents that transform classes implement ClassFileTransformer (it is an interface), and have an agentmain (instead of a main) method. The most basic agent (which does nothing) looks like this.

    public class Agent implements ClassFileTransformer {
        public static void agentmain(String s, Instrumentation i) {
        }

        @Override
        public byte[] transform(ClassLoader loader, String className,
                Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
                byte[] classfileBuffer) throws IllegalClassFormatException {
            // As far as useless goes, this is pretty useless
            return classfileBuffer; // hand the class back unchanged
        }
    }
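
    You can get a feel for the transform contract without attaching anything, because a ClassFileTransformer is just an object with a method. The sketch below records the class names it is offered and returns null, which the JVM treats as "no changes". LoggingTransformerSketch and its seen list are made up for this example, and calling transform by hand like this is only for illustration.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.ArrayList;
import java.util.List;

final class LoggingTransformerSketch implements ClassFileTransformer {

    // Names of the classes this transformer has been offered so far.
    final List<String> seen = new ArrayList<String>();

    @Override
    public byte[] transform(ClassLoader loader, String className,
            Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
            byte[] classfileBuffer) {
        seen.add(className);
        // Returning null tells the JVM to keep the original bytes.
        return null;
    }

    public static void main(String[] args) {
        LoggingTransformerSketch t = new LoggingTransformerSketch();
        // Called by hand, outside any agent, just to show the contract.
        byte[] result = t.transform(null, "com/example/Foo", null, null, new byte[0]);
        System.out.println(t.seen + " -> " + (result == null ? "unchanged" : "modified"));
    }
}
```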

    The magic occurs in what happens in transform. To begin, declare two variables:

    private static Instrumentation instrumentation = null;
    private static Agent transformer;

    And in agentmain add:

    System.out.println("Agent loaded!");

    // initialization code:
    transformer = new Agent();
    instrumentation = i;
    instrumentation.addTransformer(transformer);
    // to instrument, first revert all added bytecode:
    // call retransformClasses() on all modifiable classes...
    try {
        instrumentation.redefineClasses(new ClassDefinition(Test.class,
                Util.getBytesFromClass(Test.class)));
    } catch (Exception e) {
        e.printStackTrace();
        System.out.println("Failed to redefine class!");
    }

    What we are doing here is creating a new Agent, registering it as a transformer, and redefining Test.class. redefineClasses accepts a ClassDefinition, which is essentially a container that stores a class and the byte[] representation of that class.

    If redefineClasses succeeded, then transform is called on Test.class. It is now our job to insert some transformation logic into it.

    For this tutorial, we will write a simple profiling agent.

    Create a class called Profile, and add what you would want your start and end calls to look like. For this tutorial, I am using the following class:

    public class Profile {

        public static void start(String className, String methodName) {
            System.out.println(new StringBuilder(className)
                    .append('\t')
                    .append(methodName)
                    .append("\tstart\t")
                    .append(System.currentTimeMillis()));
        }

        public static void end(String className, String methodName) {
            System.out.println(new StringBuilder(className)
                    .append('\t')
                    .append(methodName)
                    .append("\tend\t")
                    .append(System.currentTimeMillis()));
        }
    }

    Pretty basic stuff. If we were profiling by hand, we'd have to insert Profile.start("...", "...") and Profile.end("...", "...") at the start and end of every function. That is tedious, so let's get ASM to do it for us!
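
    To make the goal concrete, this is roughly what a method looks like once the calls have been written in by hand. It is only an illustration: HandProfiledSketch is a made-up class, and its start and end record events in a list instead of printing them so the result is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;

final class HandProfiledSketch {

    // Stand-in for Profile: collects events instead of printing them.
    static final List<String> events = new ArrayList<String>();

    static void start(String className, String methodName) {
        events.add(className + "\t" + methodName + "\tstart");
    }

    static void end(String className, String methodName) {
        events.add(className + "\t" + methodName + "\tend");
    }

    // A method with the profiling calls inserted manually.
    static int square(int x) {
        start("HandProfiledSketch", "square");
        int result = x * x;
        end("HandProfiledSketch", "square");
        return result;
    }

    public static void main(String[] args) {
        System.out.println(square(7));
        for (String event : events) {
            System.out.println(event);
        }
    }
}
```

    The agent's job is to produce the same effect in bytecode, for every method, without touching the source.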

    First, we need to add some basic sanity checks. We cannot profile ourselves, for fear of a StackOverflowError, and we cannot profile anything loaded by something other than the system classloader, as our classes won't be visible to it. Our transform must perform these checks before attempting any transformations.

    if (loader != ClassLoader.getSystemClassLoader()) {
        System.err.println(className
                + " is not using the system loader, and so cannot be loaded!");
        return classfileBuffer;
    }
    if (className.startsWith("com/github/Icyene/LateBindAgent")) {
        System.err.println(className
                + " is part of profiling classes. No StackOverflow for you!");
        return classfileBuffer;
    }
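
    The two checks above can also be pulled out into a small predicate that is easy to try on its own. This is only a sketch: TransformGuardSketch and shouldInstrument are names I made up, and the null check on className is an extra precaution that is not in the code above.

```java
final class TransformGuardSketch {

    // Package prefix of the profiler's own classes, in internal (slash) form.
    static final String AGENT_PREFIX = "com/github/Icyene/LateBindAgent";

    static boolean shouldInstrument(ClassLoader loader, String className) {
        if (className == null) {
            return false; // defensive: nothing sensible to match against
        }
        if (loader != ClassLoader.getSystemClassLoader()) {
            return false; // Profile would not be visible to this class
        }
        return !className.startsWith(AGENT_PREFIX); // never profile the profiler
    }

    public static void main(String[] args) {
        ClassLoader system = ClassLoader.getSystemClassLoader();
        System.out.println(shouldInstrument(system, "com/github/Icyene/Test/Test"));
        System.out.println(shouldInstrument(system, AGENT_PREFIX + "/Profile"));
        System.out.println(shouldInstrument(null, "java/lang/String"));
    }
}
```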

    Transform has to return a byte[] representation of the new, profile-enhanced class. So we need to declare a container for those bytes, and fill it with the classfileBuffer given (the class in its current state).

    byte[] result = classfileBuffer;

    Next we instrument the class with some classes we haven't created yet, but will.

    try {
        // Create class reader from buffer
        ClassReader reader = new ClassReader(classfileBuffer);
        // Make writer (older boolean-flag ASM API: true = compute max stack/locals)
        ClassWriter writer = new ClassWriter(true);
        ClassAdapter profiler = new ProfileClassAdapter(writer, className);
        // Add the class adapter as a modifier (true = skip debug info)
        reader.accept(profiler, true);
        result = writer.toByteArray();
        System.out.println("Returning reinstrumented class: " + className);
    } catch (Exception e) {
        e.printStackTrace();
    }
    return result;

    It might seem intimidating, but all we are doing is creating a reader for the class in its current state, and creating a writer. A ClassAdapter is the magic of this entire thing; it is what modifies all the methods to include profiling info. Then we make the reader accept the profiler, and get our result byte[], which we then return.

    Now we need to actually make ProfileClassAdapter.class. For things to go smoothly, it must be a wrapper around ProfileMethodAdapter.class.

    ProfileClassAdapter is, once again, pretty basic.

    public class ProfileClassAdapter extends ClassAdapter {

        private String className;

        public ProfileClassAdapter(ClassVisitor visitor, String theClass) {
            super(visitor);
            this.className = theClass;
        }

        public MethodVisitor visitMethod(int access, String name, String desc,
                String signature, String[] exceptions) {

            MethodVisitor mv = super.visitMethod(access,
                    name,
                    desc,
                    signature,
                    exceptions);

            return new ProfileMethodAdapter(mv, className, name);
        }
    }

    We just return a new method adapter for each method we are given, in our case, a ProfileMethodAdapter.

    public class ProfileMethodAdapter extends MethodAdapter {
        private String _className, _methodName;

        public ProfileMethodAdapter(MethodVisitor visitor,
                String className,
                String methodName) {
            super(visitor);
            _className = className;
            _methodName = methodName;
            System.out.println("Profiled " + methodName + " in class " + className
                    + ".");
        }

        public void visitCode() {
            // Forward visitCode first so the method body is opened before we emit instructions
            super.visitCode();
            this.visitLdcInsn(_className);
            this.visitLdcInsn(_methodName);
            this.visitMethodInsn(INVOKESTATIC,
                    "com/github/Icyene/LateBindAgent/Profile",
                    "start",
                    "(Ljava/lang/String;Ljava/lang/String;)V");
        }

        public void visitInsn(int inst) {
            switch (inst) {
            case Opcodes.ARETURN:
            case Opcodes.DRETURN:
            case Opcodes.FRETURN:
            case Opcodes.IRETURN:
            case Opcodes.LRETURN:
            case Opcodes.RETURN:
            case Opcodes.ATHROW:
                // Insert the call to Profile.end just before the method exits
                this.visitLdcInsn(_className);
                this.visitLdcInsn(_methodName);
                this.visitMethodInsn(INVOKESTATIC,
                        "com/github/Icyene/LateBindAgent/Profile",
                        "end",
                        "(Ljava/lang/String;Ljava/lang/String;)V");

                break;
            default:
                break;
            }

            super.visitInsn(inst);
        }

    }

    What we are doing is this: when a method is visited (visitCode), we add a static call to com.github.Icyene.LateBindAgent.Profile.start(String, String) as the first instruction in the method, passing it _className and _methodName. The visitInsn override does the same with Profile.end just before every return or throw instruction.

    If you are unfamiliar with stack-based processing, just visualize a stack that you can push stuff on. We push the arguments, then invoke the method. Java then pops the last X items off the stack (X being the number of arguments the called function takes) and calls the method with them. So when we push the arguments to the stack and then call our method, we are supplying our method with those values.
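
    If the stack idea is still fuzzy, here is a toy simulation in plain Java. It is not how the JVM is implemented, just a sketch of the push, push, invoke order that the method adapter emits. OperandStackSketch and simulateStartCall are made-up names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class OperandStackSketch {

    // Mimics: ldc className; ldc methodName; invokestatic start(String, String)
    static String simulateStartCall(String className, String methodName) {
        Deque<String> stack = new ArrayDeque<String>();
        stack.push(className);  // first argument is pushed first
        stack.push(methodName); // second argument ends up on top

        // The invoke pops the arguments; the top of the stack is the LAST argument.
        String arg2 = stack.pop();
        String arg1 = stack.pop();
        if (!stack.isEmpty()) {
            throw new IllegalStateException("arguments left on the stack");
        }
        return "start(" + arg1 + ", " + arg2 + ")";
    }

    public static void main(String[] args) {
        System.out.println(simulateStartCall("com/github/Icyene/Test/Test", "sayHello"));
    }
}
```

    This is why the adapter pushes _className before _methodName: arguments go onto the stack in declaration order.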

    To test your profiler out, just run this small app. Make sure it is not in the same package as the profiler, or it will not be modified (anti-StackOverflow sanity check).

    public class Test {

        public static void main(String[] args) {

            Util.attachAgentToJVM(new Class<?>[] { Agent.class, Util.class,
                    Profile.class, ProfileClassAdapter.class,
                    ProfileMethodAdapter.class }, Util.getPidFromRuntimeMBean());

            sayHello(5);
            sayWorld();

        }

        public static void sayHello(int s) {
            System.out.println("Hello");
        }

        public static void sayWorld() {
            System.out.println("World!");
        }

    }

    If you have done everything correctly, it should print out:

    java/lang/instrument/ClassDefinition is not using the system loader, and so cannot be loaded!
    Agent loaded!
    Instrumenting class: java/lang/instrument/ClassDefinition
    Instrumenting class: com/github/Icyene/Test/Test
    Profiled <init> in class com/github/Icyene/Test/Test.
    Profiled main in class com/github/Icyene/Test/Test.
    Profiled sayHello in class com/github/Icyene/Test/Test.
    Profiled sayWorld in class com/github/Icyene/Test/Test.
    Returning reinstrumented class: com/github/Icyene/Test/Test
    com/github/Icyene/Test/Test    sayHello    start    1346891246109
    com/github/Icyene/Test/Test    sayHello    end    1346891246109
    com/github/Icyene/Test/Test    sayWorld    start    1346891246109
    com/github/Icyene/Test/Test    sayWorld    end    1346891246109
    And that is it! You are done! Having trouble getting everything to work? All the code is on Github.

    Addition 1 - fully replacing a class:
    If you are transforming a "static" class (one whose new version you already have compiled) and do not need dynamic retransforming, ASM is not required! All you need is a byte[] of the new class, which you return from transform. Additionally, to be able to kill your agent, add this to Agent.java:

    /**
     * Kills this agent
     */
    public static void killAgent() {
        instrumentation.removeTransformer(transformer);
    }
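
    The full-replacement idea from the start of this addition could be sketched like this. ReplacingTransformerSketch is a made-up class, "com/example/Target" is a placeholder name, and the three bytes in main are obviously not a real class file. Keep in mind that the JVM restricts what a redefinition may change (adding or removing fields and methods is generally not allowed), so the replacement should have the same shape as the original.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;

final class ReplacingTransformerSketch implements ClassFileTransformer {

    private final String targetInternalName; // e.g. "com/example/Target"
    private final byte[] replacement;        // bytes of the precompiled new version

    ReplacingTransformerSketch(String targetInternalName, byte[] replacement) {
        this.targetInternalName = targetInternalName;
        this.replacement = replacement.clone();
    }

    @Override
    public byte[] transform(ClassLoader loader, String className,
            Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
            byte[] classfileBuffer) {
        if (targetInternalName.equals(className)) {
            return replacement.clone(); // swap in the whole new class
        }
        return null; // null means "leave this class unchanged"
    }

    public static void main(String[] args) {
        byte[] fake = {1, 2, 3}; // placeholder, not a real class file
        ReplacingTransformerSketch t =
                new ReplacingTransformerSketch("com/example/Target", fake);
        System.out.println(t.transform(null, "com/example/Target", null, null, new byte[0]).length);
        System.out.println(t.transform(null, "com/example/Other", null, null, new byte[0]));
    }
}
```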

    Addition 2:
    The Github example is quite different from what is stated above. While it has the same idea at its basis, I nested most classes to make Agent.class encompass everything except for the Util class. It looks a bit neater (albeit longer) now.

    Addition 3 - tricking Java into loading your library:
    As it stands, the method above only works on machines that have a JDK installed, because normal machines lack attach.dll/so. Here is a code snippet that you'd call in a static{} section of code. It basically gets a JVM to load your own copy of the library. It is a bit tricky since we must add our location to java.library.path, and that is cached. Therefore, we must actually use reflection to "hack" Java itself to clear the cache. The code:

    static {
        // Get important dll, copy to a folder
        String path = "C:\\Tutorials\\AttachDir";

        if (System.getProperty("java.library.path") != null) {
            System.setProperty(
                    "java.library.path",
                    path + System.getProperty("path.separator")
                            + System.getProperty("java.library.path"));
        } else {
            System.setProperty("java.library.path", path);
        }

        try {
            // sys_paths is a private JDK field, so this may not work on every Java version
            Field fieldSysPath = ClassLoader.class
                    .getDeclaredField("sys_paths");
            fieldSysPath.setAccessible(true);
            fieldSysPath.set(null, null); // Clear library.path's !@#$ing cache.
            System.out.println("Path: "
                    + System.getProperty("java.library.path"));
            System.out.println("Attempting attaching...");
            System.loadLibrary("attach");
            System.out.println("Attaching success!");
        } catch (Throwable t) { // loadLibrary fails with UnsatisfiedLinkError, which is an Error
            System.out.println("Attaching fail!");
            t.printStackTrace();
        }
    }

    With this method, using transforming in Bukkit plugins actually becomes possible! Just change C:\Tutorials\AttachDir to something like YourPluginFolder/libraries, and you are all set!
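
    If you would rather not hardcode separators, the path juggling from the snippet above can be isolated in a small helper. This is just a sketch: LibraryPathSketch and prepend are made-up names, and "plugins/YourPlugin/libraries" is a placeholder location.

```java
import java.io.File;

final class LibraryPathSketch {

    // Returns dir on its own, or dir + separator + existing when a path is already set.
    static String prepend(String dir, String existing) {
        if (existing == null || existing.isEmpty()) {
            return dir;
        }
        return dir + File.pathSeparator + existing;
    }

    public static void main(String[] args) {
        // Placeholder folder; use wherever your plugin unpacks the attach library.
        String dir = new File("plugins/YourPlugin/libraries").getAbsolutePath();
        System.out.println(prepend(dir, System.getProperty("java.library.path")));
    }
}
```

    File.pathSeparator is the same value the static block reads through the path.separator property (";" on Windows, ":" on most other systems).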
  2. Offline


    [ADVANCED] Beyond Reflection - AspectJ - Tracing

    If you have a JDK on an operating system not listed here, it would be great if you would upload your attach.(so|dll) for people to use. The forum does not allow uploading .(dll|so) files, so just rename them to .txt etc.



  3. Offline


    md_5 Now with the example Github repo you demanded! :p
  4. Offline


  5. Offline


    Tutorials marked extreme are not for everyone :p
  6. Offline


    I hope to get to this point eventually. I am still... ... ... yeah.
    Self taught? :p
  7. Offline


    hawkfalcon Mhm... I started when I was 10-ish.
  8. Offline


    This has so many possibilities! Can't wait to use it.

    Wouldn't it be awesome if Mojang put a framework to receive (validated) agents in the minecraft client from servers, so servers could provide client side mods without the players having to install them manually? Perhaps even store them locally in a per-server profile.

    Ah, one can dream.
  9. Offline


    And you are 15. I started when I was 13, but didn't really start getting it until I was almost 15. :p
  10. Offline


    That would be pretty cool. But agents, as I have shown, are quite complex to get right. There would be limited people who would use them due to that. It's not really something a "script kiddy" can use. One can think of that as an advantage. MC would do better to just allow a Java program to get downloaded and be executed in a "sandbox" environment. Maybe feed it proxied objects, etc.

    I clicked the wrong thing when Bukkit asked for my age... I'm nearly 14 :p NEARLY.
  11. Offline


    :confused: Wow you are good. I hope to achieve your level of knowledge at some point. I want to ask you a question, but I feel like I've spammed this thread enough, so I will PM you.
  12. Offline


    Thanks :3
  13. Offline


    Ah, true. Maybe Mojang could add API methods to the plugin API they are currently making. But I'm sure that would be very limited, and still confuse a lot of people.

    Btw, the link to your github example has a trailing 't' at the end.
  14. Offline


    MrFigg If only MC was dumb enough to execute files inside zips... You could include a install.bat/sh etc in a texture pack, and have it install a client mod for your plugin. No Spout required! :D

    Thanks, fixed :p
  15. Offline


    That would also potentially be a huge security flaw though :p
  16. Offline


    Very nice, makes an advanced topic like this approachable to more users.
  17. Offline


    Thanks, that was what I was aiming for :)

    I updated the github example to not need ASM if you are modifying a class that you know the structure of already. If, say, I wanted to hack WorldServer.java I'd compile a craftbukkit.jar with my changes in WorldServer. Then I'd copy the WorldServer.class, and rename it to WorldServer.class.hack, for example. I can use the getBytesFromResource("WorldServer.class.hack") that is included in my minimalistic API. Finally, on transform, I can check if the class being transformed is called WorldServer, and if so return the bytes of WorldServer.class.hack. No ASM needed, and I have completely overridden the class!

    Most likely why they didn't do that :p

    EDIT by Moderator: merged posts, please use the edit button instead of double posting.
    Last edited by a moderator: May 28, 2016
  18. Offline


    Awesome. Does it also override already instantiated objects like the normal Java hotswap does?
  19. Offline


    md_5 I dunno. I'd say so.

    ATM I am working on a way to get the JVM to use an attach.dll that is not in JRE/lib (because of access problems on Linux servers when users aren't root).
  20. Offline


    md_5 Got it to work; you had to clear the Java path cache for you to be able to modify it to add your library >.<. I added Addition 3 on the OP showing a code snippet for it. On another note, I noticed at one point when you gave me the snippet for obtaining a PID from the AspectJ post you said it's only tested on Linux... Does that mean you have a Linux machine? If so, could you maybe upload the attach.so? :p

    Additionally, this API now has JavaDocs!
  21. Offline


    Wow... you do not act like most 14-year-olds. I presumed you were closer to my age. I feel old now. When I was 14 the only programming I was doing was Perl and TI-83 Basic (in school while I was supposed to be paying attention) and eventually Z80 assembly code... I doubt I could have figured out anything like this.

    This is really cool by the way.
  22. Offline


    Just got home and read this entire thing. I'm trying to use this to intercept some packets sent by the servers without modifying the jar, hopefully I can get something out tonight :p

    Thanks Icyene!
  23. Offline


    Courier Thanks :) I can very much relate to that... Geography class: pretending to read while actually writing Python. Well, there are now tons of resources on the internet, and the magically useful StackOverflow. 2 problems involving this were solved there. Thanks once again!

    Supertt007 No problem :) You might find this useful. It's a small, lightweight framework that xiaomao and I are developing. It doesn't make making agents less painful, but it does make loading them a piece of cake. Additionally, when it is completed it will autoinstall the needed native libraries. It also has JavaDocs if you like those :p
  24. Offline


  25. Offline


    Giant THANKS SO MUCH! (!!!) When you have the time, perhaps could you try swapping the dll I posted with yours?

    It would determine if I need a different one for each x, or if I only need a x32.
  26. Offline


    Icyene no problem! :) Doesn't look very hard though, I do have one question however... Where does "INVOKESTATIC" come from, or is that a native Java define?

    I'll give it a try, haven't really tested out ASM yet though (A)... Also, Never mind on the previous question, after looking at the github repo, turns out it's an opcode import as I sort of had figured already! :D

    Edit 2:
    By just looking at file sizes, I can already see they differ! The one in your github repo, is 18kb, where as mine is 20kb....
  27. Offline


    It comes from
    import static org.objectweb.asm.Opcodes.INVOKESTATIC;
    It's an integer that ASM made a nice constant for. Otherwise it's just 184, so it's harder to remember. Change to INVOKEVIRTUAL if your method isn't static (INVOKEINTERFACE if it is declared on an interface), or INVOKESPECIAL if it's... special (constructors and super calls).
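
    As a quick cheat sheet, the choice can be sketched as a tiny lookup. The numeric values are the standard JVM opcodes. InvokeOpcodeSketch, Kind and opcodeFor are made-up names, and the constants are redeclared here only so the example does not need ASM on the classpath.

```java
final class InvokeOpcodeSketch {

    // Standard JVM opcode values (ASM exposes the same numbers via Opcodes).
    static final int INVOKEVIRTUAL = 182;
    static final int INVOKESPECIAL = 183;
    static final int INVOKESTATIC = 184;
    static final int INVOKEINTERFACE = 185;

    // Rough classification of the call being emitted.
    enum Kind { STATIC, INSTANCE, INTERFACE, CONSTRUCTOR_OR_SUPER }

    static int opcodeFor(Kind kind) {
        switch (kind) {
        case STATIC:
            return INVOKESTATIC;
        case INSTANCE:
            return INVOKEVIRTUAL;
        case INTERFACE:
            return INVOKEINTERFACE;
        default:
            return INVOKESPECIAL; // constructors and super calls
        }
    }

    public static void main(String[] args) {
        for (Kind kind : Kind.values()) {
            System.out.println(kind + " -> " + opcodeFor(kind));
        }
    }
}
```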
  28. Offline


    Icyene had already found that, see my first edit in my previous post :D
  29. Offline


    I see :) Well normally it should still be able to load and do its job, I'd just like to make sure :p
  30. Offline


    Well, considering the file size difference, I'm sceptical about whether it will load... I suspect the 64 bit version has a few different hooks or such? (Giving a 32bit library when it expects a 64bit is quite fun btw :p)
