[INACTIVE][CHAT] RegexFilter v1.05 - Regular expression chat filter [432-860]

Discussion in 'Inactive/Unsupported Plugins' started by FloydATC, Feb 19, 2011.

  1. Offline

    FloydATC

    RegexFilter - Regular Expression chat filter
    Version: v1.05

    This plugin uses the power of Regular Expressions to filter anything you want from chat. Matching messages can be rewritten, logged and blocked, depending on the rules you define. Stop yourself from accidentally sending those embarrassing .commands to chat. Warn users who use bad language and even turn their profanities into harmless language.

    This plugin comes with a default setup that effectively replaces my dotFilter and 7filter plugins, which will no longer be maintained.

    Features:
    • Command typos beginning with certain characters can be stopped (like . and 7)
    • Optionally recover those typos and execute the command as intended
    • Define your own macros or command aliases
    • Simple but powerful configuration with built-in debugging
    • Each regular expression is compiled only once => very fast
    • Supports config reload with "/regex reload"
    • Permissions aware, rules may apply to players, groups or permission nodes
    • Automatically kick players
    • Still no cake
    Download http://minecraft.atc.no/plugins/RegexFilter.jar

    Source code is included - no strings attached, no warranties implied

    Changelog
    Version 1.00
    • Original release
    Version 1.01
    • Changed priority to "lowest" for compatibility with iChat
    Version 1.02
    • Added "/regex reload" command and "then debug" statement
    Version 1.03
    • Removed the "stupidly long constructor" as per Bukkit team recommendation
    Version 1.04
    • Added "then command" to convert chat messages into commands
    Version 1.05
    • Added permission awareness, "then kick" and "then abort"
    Installation:
    • Download and copy to your "plugins" directory
    • Optionally create the directory "plugins/RegexPlugin"
    • Optionally create the file "plugins/RegexPlugin/rules.txt" and edit
    Configuration:

    Code:
    # Each rule must have one 'match' statement and at least one 'then' statement
    # match <regular expression>
    # ignore user|group [space separated list]
    # ignore permission [node]
    # require user|group [space separated list]
    # require permission [node]
    # then replace [<string>]|warn [<string>]|command [<string>]|log|deny|debug|kick|abort
    
    # Example 1:
    match f+u+c+k+
    then replace cluck
    then warn Watch your language please
    then log
    
    # Example 2:
    match dick
    then replace duck
    
    # Emulate DotFilter
    match ^\.[a-z]+
    then warn
    then deny
    
    # Emulate 7Filter
    match ^7[a-z]+
    then warn
    then deny
    
    # Quietly turn "(command" into "/command"
    match ^\((?=[a-z]+)
    then replace
    then command
    
    
    
    Please do NOT post questions here about how to write regular expressions; there are thousands of web sites discussing this topic. Google is your friend. The plugin uses java.util.regex and is CASE_INSENSITIVE. It is NOT possible to use parens to capture substrings to be used with replace. ALL matching rules are applied in the order they appear before any action is taken.
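
    For readers wondering what "compiled only once", CASE_INSENSITIVE and "no captured substrings in replace" could look like in java.util.regex, here is a minimal sketch. It is not taken from the RegexFilter source: the class name PrecompiledRule and its methods are made up for illustration, and using Matcher.quoteReplacement to keep the replacement literal is only an assumption about one way to get that behaviour.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of RegexFilter.
class PrecompiledRule {
    private final Pattern pattern;     // compiled once, reused for every message
    private final String replacement;  // fixed text, no capture-group references

    PrecompiledRule(String regex, String replacement) {
        this.pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        // quoteReplacement turns "$1" or "\" into literal text
        this.replacement = Matcher.quoteReplacement(replacement);
    }

    // True if the pattern occurs anywhere in the message.
    boolean matches(String message) {
        return pattern.matcher(message).find();
    }

    // Replaces every occurrence of the pattern with the fixed replacement.
    String apply(String message) {
        return pattern.matcher(message).replaceAll(replacement);
    }

    public static void main(String[] args) {
        PrecompiledRule rule = new PrecompiledRule("d+u+h+", "hmm");
        System.out.println(rule.apply("DUUUH, that was obvious"));
    }
}
```

    The Pattern object is built in the constructor, so the per-message cost is only the matching itself.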
     
  2. Offline

    anon

    Too bad it doesn't work together with iChat :(
    Cool plugin though.
     
  3. Offline

    oatmealpacket

    This is fantastic, thank you very much for it.
     
  4. Offline

    FloydATC

    I haven't updated my iChat in a while so there may be something I've missed here, what seems to be the problem?

    Update: I looked into this issue now and it seems the author of iChat changed the priority a bit when he implemented his censorship feature. I've just posted a minor update in which I have changed the priority of RegexFilter so it gets to peek at (and possibly rewrite) each message before iChat broadcasts it. This means they are now fully compatible with each other and you may use both regex filtering and iChat censoring together. (Counter-intuitively, this means RegexFilter must run with lowest priority.)
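
    The "lowest priority runs first" point is easier to see in a small simulation. The sketch below uses no Bukkit classes at all; PriorityOrderDemo, Handler and seenByBroadcaster are hypothetical names, and the NORMAL-priority "broadcaster" merely stands in for a chat plugin that sends the message out itself. The only thing it borrows from Bukkit is the ordering idea that handlers with the lowest priority are called first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.UnaryOperator;

// Plain-Java simulation for illustration; no Bukkit classes are used.
class PriorityOrderDemo {
    // Declared in calling order: LOWEST handlers run first, HIGHEST last.
    enum Priority { LOWEST, LOW, NORMAL, HIGH, HIGHEST }

    static final class Handler {
        final Priority priority;
        final UnaryOperator<String> action;

        Handler(Priority priority, UnaryOperator<String> action) {
            this.priority = priority;
            this.action = action;
        }
    }

    // Calls every handler in priority order, passing the message along.
    static String dispatch(List<Handler> handlers, String message) {
        List<Handler> ordered = new ArrayList<>(handlers);
        ordered.sort(Comparator.comparing((Handler h) -> h.priority));
        String current = message;
        for (Handler h : ordered) {
            current = h.action.apply(current);
        }
        return current;
    }

    // Returns what a NORMAL-priority "broadcaster" sees when the filter
    // is registered at the given priority.
    static String seenByBroadcaster(Priority filterPriority, String message) {
        List<String> broadcast = new ArrayList<>();
        List<Handler> handlers = new ArrayList<>();
        handlers.add(new Handler(Priority.NORMAL, msg -> { broadcast.add(msg); return msg; }));
        handlers.add(new Handler(filterPriority, msg -> msg.replace("darn", "drat")));
        dispatch(handlers, message);
        return broadcast.get(0);
    }

    public static void main(String[] args) {
        System.out.println(seenByBroadcaster(Priority.LOWEST, "darn creepers"));  // filtered text
        System.out.println(seenByBroadcaster(Priority.HIGHEST, "darn creepers")); // unfiltered text
    }
}
```

    With the filter at LOWEST the broadcaster receives the rewritten message; at HIGHEST the filter runs too late to matter.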

    Also, there was a stupid typo in the default rules.txt instructing you to use "rewrite" instead of "replace". This typo has been fixed and just in case anyone got confused I have changed the code to allow the use of "rewrite" to mean "replace".
    --- merged: Feb 20, 2011 1:12 PM ---
    Another quick update (v1.02) has just been posted:
    1. A "/regex reload" command has been added, which lets you reload rules.txt without restarting the server.
    2. A "then debug" statement has been added, which dumps the current state of the finite state machine.
    3. Error output when regex compilation fails has been slightly improved for clarity.
    Let's get technical for a moment, shall we? RegexPlugin is at its heart a very simple finite state machine, controlled entirely by rules.txt which is in effect a minimalistic script language. It is important to understand that the entire script is run for every chat message, which means that one rule may affect the next one in subtle but important ways.
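
    To make "one rule may affect the next one" concrete, here is a rough sketch of a rule chain in plain Java. It is not the plugin's code: RuleChainDemo, Rule and Result are hypothetical names, only "replace" and "deny" are modelled, and the decision to collect the deny flag and act on it after the loop is an assumption based on the description in the first post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a rule chain; not the RegexFilter implementation.
class RuleChainDemo {
    static final class Rule {
        final Pattern pattern;
        final String replacement; // null means "no replace statement"
        final boolean deny;

        Rule(String regex, String replacement, boolean deny) {
            this.pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
            this.replacement = replacement;
            this.deny = deny;
        }
    }

    static final class Result {
        final String message;
        final boolean denied;

        Result(String message, boolean denied) {
            this.message = message;
            this.denied = denied;
        }
    }

    // Runs every rule in order. Each rule sees the message as left by the
    // rules before it; the caller acts on the result only after the loop.
    static Result run(List<Rule> rules, String original) {
        String current = original;
        boolean denied = false;
        for (Rule rule : rules) {
            Matcher m = rule.pattern.matcher(current);
            if (!m.find()) {
                continue;
            }
            if (rule.replacement != null) {
                current = m.replaceAll(Matcher.quoteReplacement(rule.replacement));
            }
            if (rule.deny) {
                denied = true;
            }
        }
        return new Result(current, denied);
    }

    // Builds a two-rule script; "reversed" swaps the order of the rules.
    static List<Rule> sampleRules(boolean reversed) {
        Rule rewrite = new Rule("gosh", "wow", false);
        Rule block = new Rule("^wow", null, true);
        List<Rule> rules = new ArrayList<>();
        rules.add(reversed ? block : rewrite);
        rules.add(reversed ? rewrite : block);
        return rules;
    }

    public static void main(String[] args) {
        Result a = run(sampleRules(false), "gosh that is big");
        Result b = run(sampleRules(true), "gosh that is big");
        System.out.println(a.message + " denied=" + a.denied);
        System.out.println(b.message + " denied=" + b.denied);
    }
}
```

    In the first order the rewrite produces "wow ...", which the second rule then denies; with the order swapped the same message is rewritten but not denied.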

    As a server admin trying to figure this out, you may find it handy to be able to test creative use of profanities etc. without actually spamming your own server. You don't want to set a bad example, right? In my own rules.txt I have now added the following test rule at the end:

    Code:
    match ^\?
    then debug
    then deny
    
    In plain English: If the line begins with a question mark, then debug, then deny. Now I just have to prefix whatever test message I want with a question mark to get debug information on the server console and prevent that line from being broadcast to my players:
    Code:
    [Filter] Debug match: ^\?
    [Filter] Debug original: ?I fart in your general direction!
    [Filter] Debug matched: ?I fart in your general direction!
    [Filter] Debug current: ?I fart in your general direction!
    [Filter] Debug warning: (none)
    [Filter] Debug log: no
    [Filter] Debug deny: no
    You may choose a completely different prefix or approach. Note that unlike other statements, "then debug" actually takes effect immediately when encountered. This means you can use more than one "then debug" statement to check exactly where things get screwed up. This should make it far easier to troubleshoot complicated sets of rules.

    The debug output itself may need some explanation:
    original: The chat message as it was typed by the player
    matched: The chat message as it matched the current rule. This may be different from the original message if any "replace" statements have been applied before matching the current rule.
    current: The chat message as it is right now. This may be different from the matched message if any "replace" statements have been applied by the current rule.
    warning: Shows the current warning message that will be sent to the player, or (none) if no warning message has been set.
    log: Indicates if the message has been flagged for logging.
    deny: Indicates if the message has been flagged to be cancelled. If a message is logged but not denied, the log will in effect contain both the original message as it was typed by the player and the message as it is broadcast.
     
  5. Offline

    anon

    Nice! This plugin will make my server really fun. Can already think of evil stuff to do :>
     
  6. Offline

    FloydATC

    Feel free to post interesting rule sets :-D
     
  7. Offline

    oatmealpacket

    I mostly use this plugin to force the kids on my server to stop typing like idiots - filtering "omg" to "oh dear!", "wtf" to "gee whiz!", fail to "foolishness," correcting a wide variety of common misspellings, forcing them to capitalize I whenever used and so on. It's made my chat and the server logs infinitely easier to read, though I had to filter certain slurs to "gentleman" when they found that I'd filtered "lol" to "teehee~". I also denied the term "lag" in any context, because nothing is more spammy than players going "lag," "lag?", "laggy," or "i'm lagging" anytime the server hiccups.

    The rule set I'm really looking for, though, is capitalizing the first letter of every sentence. I've toyed around with this but I can't quite get the regex to work. If someone's made this one could you share?
     
  8. Offline

    fullwall

    To oatmealpacket:

    ^[a-z].*?\. [a-z] Will find the first letter and the one after the first stop. Looking up how to repeat.

    (\. [a-z]){0,5} Will match the first five occurrences, I believe. But I'm not very good at regex :(.
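
    Since "then replace" only takes a fixed string, capitalizing whatever letter happened to match is probably not something a rules.txt rule can do on its own. Outside the plugin, in plain Java, the idea can be sketched as below. SentenceCaps is a hypothetical class, and the pattern is a simplification that treats any ".", "!" or "?" followed by whitespace as a sentence end, so it will also capitalize after abbreviations such as "e.g. ".

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-alone illustration; this is not a RegexFilter rule.
class SentenceCaps {
    // A lowercase letter at the start of the text, or after ., ! or ?
    // followed by whitespace.
    private static final Pattern SENTENCE_START =
            Pattern.compile("(^|[.!?]\\s+)([a-z])");

    static String capitalize(String text) {
        Matcher m = SENTENCE_START.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Keep the punctuation and spacing, upper-case only the letter.
            String fixed = m.group(1) + m.group(2).toUpperCase(Locale.ROOT);
            m.appendReplacement(out, Matcher.quoteReplacement(fixed));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(capitalize("hello there. how are you? fine."));
        // prints "Hello there. How are you? Fine."
    }
}
```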
     
  9. Offline

    oatmealpacket

    Awesome, I'll play around with that then. Thanks Fullwall.

    Incidentally, has anyone noticed this plugin causing some degree of lag? My players haven't mentioned it, of course, because they're no longer capable of doing so, but I think there might be a bit.
     
  10. Offline

    FloydATC

    LOL

    Lag (or rather, not introducing it) was my main concern when designing this plugin. Consider what happens if 50 players are chatting away and everything they say causes a set of 250 rules to go off. I'm still not 100% sure if I've covered all the potential pitfalls but my reasoning goes a little like this:

    First of all, if a player is chatting then that means he/she isn't moving or building so if the filter takes as much as a second to process the message then he/she isn't likely to notice it. A reasonably powerful server should be able to run a hundred simple expressions in a few milliseconds as long as they're precompiled. Since each player is handled by a separate thread, other players are not sitting around waiting for all of this processing anyway, so there would have to be a lot of chatting before server performance as a whole should take a hit.

    On the other hand, regular expressions are fickle beasts. I don't even know if java.util.regex supports look-ahead and look-behind but even without those it's quite possible to write a complex regular expression that can spend several seconds crunching a long line of text carefully designed to confuse it.

    It all comes down to just how many weird tricks you decide to put into rules.txt. Using anchors such as ^ (beginning of line), $ (end of line), \b (word boundary) etc. where appropriate can help to speed things up; long and non-greedy patterns will generally execute faster than short and greedy ones.
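
    For anyone who wants to see the "text carefully designed to confuse it" effect, here is a small experiment in plain Java. BacktrackingDemo and its helpers are made-up names. The two patterns accept exactly the same strings, but the nested quantifier gives the engine many different ways to split up a run of a's before it can give up. Timings depend heavily on the JVM and hardware, so no numbers are claimed here; run it and compare.

```java
import java.util.regex.Pattern;

// Illustration only; the patterns are not from any real rules.txt.
class BacktrackingDemo {
    // Nested quantifiers: many ways to divide a run of 'a's between
    // the inner and the outer "+".
    static final Pattern NESTED = Pattern.compile("^(a+)+$");
    // Accepts the same strings, but there is only one way to match them.
    static final Pattern SIMPLE = Pattern.compile("^a+$");

    // Builds n copies of 'a' followed by '!', so neither pattern can match.
    static String trickyInput(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append('a');
        }
        return sb.append('!').toString();
    }

    // Time taken by a single failed or successful match attempt.
    static long timeNanos(Pattern p, String input) {
        long start = System.nanoTime();
        p.matcher(input).find();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        String input = trickyInput(20);
        System.out.println("simple: " + timeNanos(SIMPLE, input) + " ns");
        System.out.println("nested: " + timeNanos(NESTED, input) + " ns");
    }
}
```

    Keep the run of a's short when experimenting; on engines that backtrack naively, each extra character can roughly double the work for the nested pattern.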

    At the end of the day, I reckon most movement trackers and cuboid plugins tax the server far worse than any chat filter, since they handle hundreds (if not thousands) of events every second.
     
  11. Offline

    RagingMonocle

    It doesn't seem to work on my server, what version of bukkit does it support?
    I use b289.
     
  12. Offline

    FloydATC

    Kinda hard to tell at the moment... :-/
    Code:
    2011-02-21 21:36:52 [INFO] This server is running Craftbukkit version git-Bukkit-"51dd641"-b{$bamboo.buildNumber}bmb (MC: 1.2_01)
    2011-02-21 21:36:52 [INFO] This server is also sporting some funky dev build of Bukkit!
    Last Sunday, around lunch time? I suspect you need to get to at least 300-something but I can't really tell. What happens, does it load at all or are you getting error messages from craftbukkit?
     
  13. Offline

    xTom

    dude, this is just awesome
     
  14. Offline

    Wulfspider

    This was what I was looking for! I have a huge list of words I filter being as everyone is always making up some variation to get around filters... I'll have to test it out when I get home today.
     
  15. Offline

    Archelaus

  16. Offline

    ibninja

    If I set
    Code:
    match ^\.h
    then replace /h
    and then enter ".help" it will have me say "/help" rather than running the command. Is there a way to make it actually run the command? (If this is possible, having a way to run multiple commands would also be nice, to make short cuts for worldedit and similar things)
     
  17. Offline

    FloydATC

    No, the plugin will not turn a chat event into a command event, sorry. Even if it could be done, the list of potential problems is just too long for a Java newbie like myself.
     
  18. Offline

    DerpinLlama

    Woo censorcraft? >_>
     
  19. Offline

    ibninja

    I'm not completely sure why, but adding

    Code:
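    // Set when a "then command" line has been handled
    Boolean command = false;
    if (line.startsWith("then command ")) {
        command = true;
        // Note: "then command " is 13 characters, so substring(14) also skips
        // the first character of the argument (hence the leading " described below)
        player.chat(line.substring(14));
        valid = true;
    }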
    and changing the closing to
    Code:
    if (command) {
        event.setCancelled(cancel);
    } else {
        event.setMessage(message);
        event.setCancelled(cancel);
    }
    in the player listener lets: then command "/command arguments
    actually run the command with arguments. The " is necessary, and having the right side also have one breaks it. This allows making shortcuts for commands that don't require particularly complex arguments. Possibly useful. (It seems like maybe replacing event.setMessage(message); with player.chat(message); would allow this without breaking other things? I don't have the time to experiment with that just yet, though.)
     
  20. Offline

    FloydATC

    @ibninja: I missed the part that actually turns message into a command as if it was typed by the player? This would need to be done in a way that doesn't fall apart or cause other weird problems if there is a problem with the command itself (syntax error, lack of permissions, broken plugin...)

    Let's say plugin X works fine unless an admin makes a typo that causes the command to pass through RegexPlugin, which silently fixes the typo and introduces some subtle error that causes plugin X to misbehave. The author of that plugin might spend weeks troubleshooting.

    Don't get me wrong, I think the "then command" idea is interesting. I just want to get it done right if we are to do it at all :)
     
  21. Offline

    ibninja

    It does seem relatively easy to break things (or go into infinite loops) if you use player.chat instead of event.setMessage. Setting it to allow ^/ without changes fixes this, but dealing with that for censorship functions would be nasty. I'll let you know if I come up with anything clever. Until that point, I guess I'll just adapt a version to my needs.
     
  22. Offline

    FloydATC

    I have just posted an update, version 1.04. After quite a bit of experimenting, I think I found a safe way to do it. As ibninja suggested, I have added a "then command" clause. (Thank you!) There are a couple of subtle differences though:

    1. The new "then command" takes an optional argument. If present, the chat message will be appended to that command. If not present, the message is used as-is. Rewrite if needed.
    2. The leading slash is hardcoded to prevent accidental chat loops.
    3. Whenever RegexFilter turns a chat message into a command, a statement to that effect is logged.
    4. The argument to "then replace" is now optional. If not present, the matching string will be replaced with an empty string.

    The net result is that the following rule will convert .command into /command:
    Code:
    match ^\.(?=[a-z]+)
    then replace
    then command
    
    That looks a bit like line noise so here's how it works:
    ^ is the beginning of the string
    \. is a literal full stop (period) character
    (?=...) is a positive look-ahead assertion; this is a regular expression trick which means this part of the match will not be counted as part of the substring to be replaced.
    [a-z] is any character from a to z inclusive
    + means one or more of the preceding character, class or pattern
    then replace simply removes the full stop (period) character
    then command simply turns the message into a command

    Not simple, not intuitive, but it does the trick.
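
    The same three-line rule can be tried out in plain Java to see what the look-ahead does. This is only a sketch of the regex behaviour: DotCommandDemo and toCommand are hypothetical names, no Bukkit calls are made, and returning null for "not a command" is just a convenience of this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration of the rule's regex only; not the plugin's code.
class DotCommandDemo {
    // Same expression as the rule above, case-insensitive like the plugin.
    private static final Pattern DOT_COMMAND =
            Pattern.compile("^\\.(?=[a-z]+)", Pattern.CASE_INSENSITIVE);

    // Returns the command line to run, or null if the message is plain chat.
    static String toCommand(String message) {
        Matcher m = DOT_COMMAND.matcher(message);
        if (!m.find()) {
            return null;
        }
        // "then replace" with no argument: only the dot is matched, because
        // the letters inside the look-ahead are checked but not consumed.
        String stripped = m.replaceFirst("");
        // "then command": the leading slash is added in one fixed place.
        return "/" + stripped;
    }

    public static void main(String[] args) {
        System.out.println(toCommand(".help me"));  // prints "/help me"
        System.out.println(toCommand("...what?"));  // prints "null"
    }
}
```

    Without the look-ahead, [a-z]+ would be part of the match and "then replace" would remove the command name along with the dot.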
     
  23. Offline

    Wulfspider

    I've tried this and it doesn't seem to work for me. Perhaps another chat plugin is interfering? The other word filters work fine though in public and party chat, but not private msg.

    Edit: Wow, I need to go to bed... I forgot that I didn't move over 1.04 to the plugins folder yet... ^_^
     
  24. Offline

    FloydATC

    This is to be expected I'm afraid. If those private messages begin with a command (prefixed with "/") then chat filters never get to see them. The same goes for any chat message that another chat plugin with a lower priority than RegexFilter intercepts and turns into a command before RegexFilter gets a chance to look at it.

    Yesterday I was on my test server trying to debug a rule I had defined only on my production server. For 10 minutes. :)
     
  25. Offline

    TnT

    What build has this been tested to work with?
     
  26. Offline

    FloydATC

    Jenkins 432 and 440 and earlier have been tested by myself, updating title now.
     
  27. Offline

    Buchholdt

    Thanks, good work :)
     
  28. Offline

    Wolfy9247

    Could you please update to Craftbukkit b602? Thanks!
     
  29. Offline

    FloydATC

    Oops, I didn't notice it was broken, expect an update this afternoon. It's purely cosmetic, yes?

    Update: I've just tested RegexFilter v1.04 with 600, 602, 612 and 617 and didn't see any issues. Did I miss anything?
     
  30. Offline

    Wolfy9247

    Oh my sincere apologies, it is in fact still working with the latest builds... well, to my knowledge it was working fine when I loaded it. I must have updated with the wrong craftbukkit.jar before! (I need to organize the folder sometime...)

    Thanks though! ~
     
