Brute Force Solution Using repeated SED commands

Note: there are many online guides and tutorials on the SED filter command in Linux.

Here is just one such guide:

http://sed.sourceforge.net/sedfaq.html

The above guide has a tutorial section, and one very useful
tutorial is at this link:

http://www.cs.hmc.edu/tech_docs/qref/sed.html

Yesterday, I decided to find an IRC channel which discusses Ubuntu.

I looked in the Synaptic package manager and found an IRC client called Konversation.

It was a simple, quick install that gave me a wonderful GUI IRC control panel and automatically directed me to a channel discussing Ubuntu, with 300 members present from around the world.

I made friends with a fellow in Brazil who was trying to get his ethernet card to work.

Konversation has an option to continuously LOG all conversation to disk.

I leave Konversation on all day long and collect the log. But the log is filled with extraneous lines about people Joining the channel, Parting the channel, changing their Nick, etc.

I wanted to find a way, using Linux commands, to filter out everything extraneous and leave only what people actually posted, so that I could more easily study it and learn something new.

I looked through my Linux Pocket Guide from O’Reilly. I also googled a bit, and decided to use the SED command.

I think the important thing to remember for any beginner is that there is more than one way to skin a cat. A very experienced programmer would know how to perform this task in one line, and that one line would probably look like an equation from quantum physics, and be difficult to understand. BUT you can also accomplish your desired task using many simple steps; many simple commands each of which is easy to understand and visualize.

The following code will work and do what I want when I paste it into the TERMINAL.

sed '3d' /home/bryan/logs/logtest/test.log > output1.log
sed '/Join/d' /home/bryan/output1.log > output2.log
sed '/Mode/d' /home/bryan/output2.log > output1.log
sed '/Part/d' /home/bryan/output1.log > output2.log
sed '/Quit/d' /home/bryan/output2.log > output1.log
sed '/Nick/d' /home/bryan/output1.log > output2.log
sed -e 's/\[[^]]*\]//g' /home/bryan/output2.log > output1.log

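If you paste the above into the terminal from the home directory (the output files are written to the current directory but read back from /home/bryan), it will execute in an instant and produce output1.log as the desired result.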

For ease of discussion, I shall number each line (but do not include the numbers in what you paste into TERMINAL for execution)

1. sed '3d' /home/bryan/logs/logtest/test.log > output1.log
2. sed '/Join/d' /home/bryan/output1.log > output2.log
3. sed '/Mode/d' /home/bryan/output2.log > output1.log
4. sed '/Part/d' /home/bryan/output1.log > output2.log
5. sed '/Quit/d' /home/bryan/output2.log > output1.log
6. sed '/Nick/d' /home/bryan/output1.log > output2.log
7. sed -e 's/\[[^]]*\]//g' /home/bryan/output2.log > output1.log

The first line was my initial test of SED.
First, I copied the main IRC log to test.log.
Next I issued line 1, a sed command which deletes the third
line of test.log and outputs the rest to a file called
output1.log in my home directory. (To omit the first three lines instead, the address would be written 1,3d.)
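
A quick way to see what a line-number address does is to feed sed a few numbered lines. This is only a small sketch using made-up input rather than the real log: the address 3 selects the third line alone, while the range 1,3 selects lines one through three.

```shell
# Five made-up lines stand in for the top of a log file.
sample=$(printf 'one\ntwo\nthree\nfour\nfive')

# '3d' deletes line 3 only.
only_third=$(printf '%s\n' "$sample" | sed '3d')

# '1,3d' deletes the whole range of lines 1 through 3.
first_three=$(printf '%s\n' "$sample" | sed '1,3d')

echo "$only_third"
echo "---"
echo "$first_three"
```

The first result still contains "one" and "two"; the second starts at "four".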

Next, line 2 omits every line which contains the string Join and outputs the results to output2.log.

Line 3 omits every line which contains the string Mode and outputs the results to output1.log.

Line 4 omits every line which contains the string Part and outputs the results to output2.log.

Line 5 omits every line which contains the string Quit and outputs the results to output1.log.

Line 6 omits every line which contains the string Nick and outputs the results to output2.log.
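
The alternating output1.log and output2.log files are only a way to hand the result of one command to the next. The same hand-off can be done with pipes, as in this sketch. The sample lines, the temporary directory and the name filtered.log are made up for the demonstration; the real log path would go where $log appears.

```shell
# Work in a throwaway directory so nothing in the home directory is touched.
workdir=$(mktemp -d)
log="$workdir/test.log"

# A few made-up lines in the same shape as the Konversation log.
cat > "$log" <<'EOF'
[Friday 20 November 2009] [17:54:38] Join giuseppe__ has joined this channel.
[Friday 20 November 2009] [17:54:39] there must be a server listening to that port
[Friday 20 November 2009] [17:54:43] Mode Channel modes: topic protection
[Friday 20 November 2009] [17:55:01] so i think about to start kubuntu in virtualbox
[Friday 20 November 2009] [17:55:20] Quit taki_ has left this server.
EOF

# Each sed reads the previous sed's output, so no intermediate files are needed.
sed '/Join/d' "$log" | sed '/Mode/d' | sed '/Part/d' | sed '/Quit/d' | sed '/Nick/d' > "$workdir/filtered.log"

cat "$workdir/filtered.log"
```

Only the two chat lines remain in filtered.log. Each step is still one simple command, just joined by | instead of by a file.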

Line 7 was the most difficult command to find on Google.
Line 7 deletes everything between a left bracket [ and a right bracket ], brackets included, which in essence deletes all the time and date stamps. It also removes any other bracketed text, such as [freenode-info].

For example, the raw log file looks something like this example excerpt:

#kubuntu (n=bryan@pool-70-19-93-222.ny325.east.verizon.net).
[Friday 20 November 2009] [17:54:37] Topic The channel topic is “Official Kubuntu support | Kubuntu 9.10 Karmic Koala released! | Download your free Kubuntu 9.10 CD iso: http://www.kubuntu.org/getkubuntu | For pressed CDs, please ask your LoCo Team | KDE 4.3.3 for Karmic in the backports: http://www.kubuntu.org/news/kde-4.3.3 | FAQ: https://wiki.kubuntu.org/FAQ | Pastes: http://paste.ubuntu.com | Chat in #kubuntu-offtopic | Please respect the Ubuntu IRC guidelines: https://wiki.kubuntu.org/IrcGuideli”.
[Friday 20 November 2009] [17:54:37] Topic The topic was set by Mamarok on 2009-11-04 14:33.
[Friday 20 November 2009] [17:54:37] Channel [freenode-info] please register your nickname…don’t forget to auto-identify! http://freenode.net/faq.shtml#nicksetup
[Friday 20 November 2009] [17:54:38] Join giuseppe__ has joined this channel (n=giuseppe@net-188-217-142-51.cust.dsl.vodafone.it).
[Friday 20 November 2009] [17:54:39] there must be a server which is listening to that port. If not: what will happen to connections to the port?: nothing
[Friday 20 November 2009] [17:54:43] Mode Channel modes: topic protection, no messages from outside, no colors allowed, L, f, J
[Friday 20 November 2009] [17:54:43] Created This channel was created on 2006-11-26 01:42.
[Friday 20 November 2009] [17:55:01] so i think about to start kubuntu on sdb in virtualbox and copy file for file to kubuntu and try it out
[Friday 20 November 2009] [17:55:11] Join taki_ has joined this channel (n=quassel@41.141.85.174).

And the reformatted final log file looks like this portion:

!hyperterminal
Sorry, I don’t know anything about hyperterminal
!hyper terminal
-_-
goes_de: no, not the same kde-folder. Thats wrong… suse have some individual configs
Sorry, I don’t know anything about hyper terminal
I did one Wubi install of Ubuntu on a Windows XP, and then a full Ubuntu on a Gateway tower, and now I am trying to connect via Ethernet DSL
xterm?
goes_de: copy and paste will break kubuntus kde
DTsan: what exacly are ya looking for?
I tried installing iirc with synaptic manager, but no obvious way to launch it, so Konversation is much better, right on the Internet menu
does everyone here use Ubuntu? (is this a dumb question?)
lol
WilliamBuell: I’m forced to use ubuntu at school
avihay: i need a program that acts like Hyper Terminal, perferable one that does rtv995 emulation
I am blogging step by step with each thing do in Ubuntu, so I will have documentation, and can remember later
WilliamBuell: i have found that not everyone but most do i still can’t stop laughing ubuntu kubuntu not much differnce just window manager
DTsan: something that will talk with the serial port?
ohhhh i see, kubuntu is a seperate flavor of, what, debian
no
does anyone here blog about ubuntu, you are welcome to look at my blog, .. it might help some beginners out
no, work over lan
to an ip address
like telnet?
yeah
avihay: where is your school located, what state, what country, I read that some countries like Switzerland are making Linux manditory
well ubuntu used to be based off of debian but now 99% on there own WilliamBuell
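
The bracket-stripping step can be tried on a single line from the raw excerpt above. The sketch below runs the same substitution as line 7, then adds a second expression to trim the spaces left behind where the stamps used to be. That second expression is an addition for illustration, not part of the original seven lines.

```shell
# One line taken from the raw excerpt above (shortened).
line='[Friday 20 November 2009] [17:54:39] there must be a server which is listening to that port.'

# \[ matches a literal [, [^]]* matches any run of characters that are not ],
# and \] matches the closing ]. The g flag repeats this for every pair on the line.
stripped=$(printf '%s\n' "$line" | sed -e 's/\[[^]]*\]//g')

# Assumption: the leading spaces left behind are unwanted, so a second
# expression (not in the original commands) removes them.
trimmed=$(printf '%s\n' "$line" | sed -e 's/\[[^]]*\]//g' -e 's/^ *//')

# Angle brackets make the leading spaces visible.
printf '<%s>\n' "$stripped"
printf '<%s>\n' "$trimmed"
```

The first result keeps two leading spaces, one for each stamp that was removed; the second starts directly at the message text.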

+++++++++++++++++

One more knowledgeable programmer in IRC suggested the following to me:

sed -r '/(Join)|(Mode)|(Part)|(Quit)|(Nick)/d' logfile
This deletes every line with Join, Mode, Part, Quit or Nick in it.
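
That suggestion can be tried on a few sample lines. One thing to watch for, with either approach, is that /Join/ matches the word anywhere on the line, so a chat message that happens to mention Join is deleted too. The second command below is a sketch of a stricter pattern that only matches the keyword when it directly follows the two bracketed stamps. It assumes every log line starts with that two-stamp layout, as in the raw excerpt above, and the sample lines are made up. The -E option is the same as GNU sed's -r and is also accepted by BSD sed.

```shell
# Made-up sample: two event lines and one chat line that mentions "Join".
sample='[Friday 20 November 2009] [17:54:38] Join giuseppe__ has joined this channel.
[Friday 20 November 2009] [17:54:40] how do I Join two files together?
[Friday 20 November 2009] [17:54:43] Mode Channel modes: topic protection'

# The suggested one-liner: deletes every line containing any of the five words.
loose=$(printf '%s\n' "$sample" | sed -E '/(Join)|(Mode)|(Part)|(Quit)|(Nick)/d')

# Stricter sketch: the keyword must come right after the date and time stamps.
strict=$(printf '%s\n' "$sample" | sed -E '/^\[[^]]*\] \[[^]]*\] (Join|Mode|Part|Quit|Nick) /d')

echo "loose:  $loose"
echo "strict: $strict"
```

With this sample the loose pattern deletes all three lines, including the chat message, while the strict pattern keeps the chat message and removes only the two event lines.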


One Response to “Brute Force Solution Using repeated SED commands”

  1. noren Says:

    good to see u on the ubuntu irc . its really good to meet some new and avid fan of ubuntu. even i have been using ubuntu since last three years and learning new thing everyday.
    cheers
    keep learning
