{"id":5132,"date":"2010-12-13T16:51:51","date_gmt":"2010-12-13T20:51:51","guid":{"rendered":"http:\/\/www.ezrasf.com\/wplog\/?p=5132"},"modified":"2010-12-13T16:53:04","modified_gmt":"2010-12-13T20:53:04","slug":"convert-webserver-log-to-csv","status":"publish","type":"post","link":"https:\/\/www.ezrasf.com\/wplog\/2010\/12\/13\/convert-webserver-log-to-csv\/","title":{"rendered":"Convert Webserver.log to CSV"},"content":{"rendered":"<p>A security guy at a campus wanted our web server log file in the CSV format. The original file has lines which look something like:<\/p>\n<blockquote><p>machine.usg.edu: webserver.log13646,2010-11-30 \u00c2\u00a0 \u00c2\u00a0 \u00c2\u00a0 \u00c2\u00a011:08:32 \u00c2\u00a0 \u00c2\u00a0 \u00c2\u00a0 \u00c2\u00a00.0010 \u00c2\u00a0999.999.999.999 \u00c2\u00a0 \u00c2\u00a0b7tPM1hTgGYMn90bLTM1 \u00c2\u00a0 \u00c2\u00a0200 \u00c2\u00a0 \u00c2\u00a0 GET \u00c2\u00a0 \u00c2\u00a0 \/webct\/urw\/lc987189066271.tp1333853785371\/blank.html \u00c2\u00a0 \u00c2\u00a0&#8211; \u00c2\u00a0 \u00c2\u00a0 \u00c2\u00a0 262 \u00c2\u00a0 \u00c2\u00a0 &#8220;Mozilla\/5.0 (Macintosh; U; Intel Mac OS X 10_6_5; en-us) AppleWebKit\/533.19.4 (KHTML, like Gecko) Version\/5.0.3 Safari\/533.19.4&#8221; username:0:0<\/p><\/blockquote>\n<p>Turns out I only need three sed edits to make it look the way I want:<\/p>\n<blockquote><p>sed &#8216;s|:2009-|,2009-|g&#8217; testfile.txt\u00c2\u00a0| sed &#8216;s|\\t|,|g&#8217; | sed &#8216;s|: |,|g&#8217;<\/p><\/blockquote>\n<p>The first converts the colon between the end of the file name and the year into a comma. The second converts all the tabs into commas, and the last changes the colon-space between the host name and webserver.log into a comma.<\/p>\n<p>Easy enough. That line from the web server log now looks like:<\/p>\n<blockquote><p>machine.usg.edu,webserver.log13646,2010-11-30,11:08:32,0.0010,999.999.999.999,b7tPM1hTgGYMn90bLTM1,200,GET, \/webct\/urw\/lc987189066271.tp1333853785371\/blank.html,-,262, &#8220;Mozilla\/5.0 (Macintosh; U; Intel Mac OS X 10_6_5; en-us) AppleWebKit\/533.19.4 (KHTML, like Gecko) Version\/5.0.3 Safari\/533.19.4&#8221;,username:0:0<\/p><\/blockquote>\n<p>I love regular expressions.<\/p>\n<p>I have a feeling I&#8217;ll need to make a primer for this guy too. \ud83d\ude41<\/p>\n<blockquote><p>Hostname,Log Name, Date, Time, Seconds to Process, Load Balancer IP, Session ID, HTTP Response Code, HTTP Method, URI, URI Parameters, Bytes Returned, User Agent, Username:Transactions Read:Transaction Written<\/p><\/blockquote>\n","protected":false},"excerpt":{"rendered":"<p>A security guy at a campus wanted our web server log file in the CSV format. The original file has lines which look something like: machine.usg.edu: webserver.log13646,2010-11-30 \u00c2\u00a0 \u00c2\u00a0 \u00c2\u00a0 \u00c2\u00a011:08:32 \u00c2\u00a0 \u00c2\u00a0 \u00c2\u00a0 \u00c2\u00a00.0010 \u00c2\u00a0999.999.999.999 \u00c2\u00a0 \u00c2\u00a0b7tPM1hTgGYMn90bLTM1 \u00c2\u00a0 \u00c2\u00a0200 \u00c2\u00a0 \u00c2\u00a0 GET \u00c2\u00a0 \u00c2\u00a0 \/webct\/urw\/lc987189066271.tp1333853785371\/blank.html \u00c2\u00a0 \u00c2\u00a0&#8211; \u00c2\u00a0 \u00c2\u00a0 \u00c2\u00a0 262 \u00c2\u00a0 \u00c2\u00a0 &#8220;Mozilla\/5.0 [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":4,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"","footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[22,109],"tags":[1802,1801,479],"class_list":["post-5132","post","type-post","status-publish","format-standard","hentry","category-bbvista","category-unix","tag-csv-format","tag-host-name","tag-web-server"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p1rUBW-1kM","jetpack-related-posts":[],"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/www.ezrasf.com\/wplog\/wp-json\/wp\/v2\/posts\/5132","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.ezrasf.com\/wplog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.ezrasf.com\/wplog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.ezrasf.com\/wplog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.ezrasf.com\/wplog\/wp-json\/wp\/v2\/comments?post=5132"}],"version-history":[{"count":0,"href":"https:\/\/www.ezrasf.com\/wplog\/wp-json\/wp\/v2\/posts\/5132\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.ezrasf.com\/wplog\/wp-json\/wp\/v2\/media?parent=5132"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.ezrasf.com\/wplog\/wp-json\/wp\/v2\/categories?post=5132"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.ezrasf.com\/wplog\/wp-json\/wp\/v2\/tags?post=5132"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}