information

Catherine, whom I follow on Twitter, retweeted a link to this blog post and agreed with it:

Last night while sitting at a pub with some friends, the topic of information came up. My friend Tom, in particular, had a few interesting things to say about it. I asked him if he thought that constantly being tapped into the stream of information that the online world affords us was a bad thing. Is our constant connection to blog posts, news articles, video, podcasts, Twitter, and Facebook more detrimental than positive? Are we a culture of information junkies?

His response, essentially, was “no”. He basically said that in fact (and I’m paraphrasing) we have always been able to tap into information whenever we wanted to. Back in the old days, Tom said he used to scour through encyclopedias, magazines, and books all day long. He was always consuming information and learning new things. I thought back to my younger days and realized I did much the same. You probably did too. The difference is, said Tom, these days information is with us wherever we go. We carry the encyclopedia, and magazines, and books in our pockets. Information is always there, on any topic. It’s an amazing thing, said Tom, and it’s a good thing.

Back in the old days, at 9 years old, I owned more books than all of my close friends and their parents put together. I had read every novel more than once. Around a sixth of my books were science and history, which I often partially re-read, usually while looking for something specific to prove a point. It could have been something novel that needed understanding. It could have been a claim by someone else. It probably did not help that from seven to ten years old, I spent a couple of hours every weekday after school at the library.

Later, as a teen when I played Dungeons and Dragons, I knew my books well enough that I could open a book to within a few pages of the information I wanted. Of course, I replaced my Player’s Handbook and Dungeon Master’s Guide three times when they fell apart from overuse.

As a college student, I worked in the university library. It had a larger and much more stimulating collection of journals, books, maps, microfilm, and microfiche than the public library of my childhood. Some thought me a dedicated worker. Ha! Having unfettered access to information was the best form of entertainment.

Only working with computer systems and the Internet, and so with even more information, could tear me away from the libraries.

But am I an information junkie? Oh, yeah, absolutely. Finding that kernel of information that answers a question, solves a problem, or wins an argument causes a surge of dopamine. People get the same dopamine high from winning a game.

The delayed gratification of waiting to search my own or another library, or waiting until I got to a computer to search the Internet, was probably a good thing. I’m learning to be better about not whipping out my phone to track down every possible query that crosses my mind.

Do we consume too much information? I might. Lately I have thought about reducing the amount of following I do. Typically I hit this point when I realize it takes me all weekend to catch up. To be fair, I reach this point by getting all caught up over a long weekend and seeking out new stuff. The potential reductions:

  • 40% of the blog or news feeds (remove over 100),
  • 40% of Facebook friends (remove over 250),
  • 40% of Tumblr following (remove 45),
  • 40% of Twitter following (remove 100).

Then there are the potential stoppages:

  • Stop following tags on WordPress.com, Tumblr, Blogspot, Flickr.
  • Stop using some social media sites entirely, like Google+ or Diaspora.

Given my social preferences, I have lots of time to spend online.

Here is the true meaning and value of compassion and nonviolence, when it helps us to see the enemy’s point of view, to hear his questions, to know his assessment of ourselves. For from his view we may indeed see the basic weaknesses of our own condition, and if we are mature, we may learn and grow and profit from the wisdom of the brothers who are called the opposition.

From the Martin Luther King, Jr. entry on Wikiquote.

At brunch yesterday, the point was made to me over and over that if climate change advocates could ask deniers, “What would it take to convince you?” and then provide that data or those answers, it would spark the dialogue necessary to help both sides understand each other. Running across the quote above, it struck me as quite funny and unsurprising that I would be on the wrong side of MLK.

As though proving my point, my repeated argument that, according to studies, ideology trumps facts fell on deaf ears. When false information (such as a misleading negative campaign ad) agrees with a person’s ideology, a later retraction or fact check tends to strengthen belief in the false information. The recalled “facts” are those necessary to defend conclusions. It appears to work this way for both liberals and conservatives. The mechanics appear to include remembering the false information because one agrees with it and not the correction because one disagrees with it.

From before I ran across this through to the present, I have tried to expose myself to Libertarian, Republican, Green, and Democratic information sources. I find myself dismissing some things and then, armed with the ideas above, feel bad about having done so. So I dig for more information and sometimes find I was wrong. Doing this is hard. It is far easier to just assume I was already correct. But then I am an information glutton.

Seth said, “What we don’t need are mere clerks who guard dead paper.” Whenever I read “mere”, “only”, or “just” as a descriptor, it makes me sad that someone (even me) relies on obvious straw men.

Librarians already do more than guard dead paper. Portraying them as such just makes it easier to knock them down and kick them while they are down. Of course, the point is that Seth wants to see “… a librarian who can bring domain knowledge and people knowledge and access to information to bear…” which describes… every… librarian… I have ever known going back to age 5. Maybe growing up in and working in libraries gives me a different perspective than Seth?

The librarians I know…

  • Help patrons learn how to find information.
  • Learn quickly what the patron knows and how to connect the dots.
  • Have a master’s or doctorate in library (information) science but an undergraduate degree in something else because almost nowhere offers a bachelor’s in it.

How about this? “What we don’t need are mere scribes who throw words on paper. I want to see an author who can bring domain knowledge and people knowledge and communicate information.” Yeah. Still just as demeaning without being at all helpful.

Earlier today, Blackboard announced that Anya Kamenetz, author of DIY U, will give the DevCon keynote. It continues the tradition of ironic keynote speakers in even years:

  • 2008 Michael Wesch, who spoke on how the traditional one-to-many classroom model isn’t good for helping students learn. The two LMS products Blackboard makes continue the one-to-many model online. He advocated using free online Web 2.0 tools to aggregate the relevant information students collectively research and provide it to the many-to-many class discussion.
  • 2006 David Weinberger, who spoke on how digitalization changes how we organize information. He was previously a contributor to The Cluetrain Manifesto, whose point was that corporations need to have honest conversations with customers because we do talk to each other and discover deception.

How does DIY U continue the irony in 2010? Well, the idea is to get rid of the education model where students look solely to experts (aka professors) to provide information. Instead, students use the abundance of information available online for free, such as OpenCourseWare, and use the experts to give practical application experience. An LMS is designed to place the expert (the instructor role) as the provider of the information, the exact opposite of what Anya advocates.

Ideally, Blackboard arranges these to pressure itself to adapt to the changing landscape.

If so, then based on the 2006 keynote, Blackboard should have a culture of engineers and developers willing to talk frankly to me about the products. They should be hanging out on the email lists where I seek peer solutions, offering their own solutions given their insider access. They should be on Twitter. There are a few who do this, but they are quite rare.

I’ve already argued how the LMS is Web 1.5 not 2.0.

Maybe in 2012.

The Linux tool xmllint is my new best friend. We get thousands of XML files from our clients for loading user, class, and enrollment information. Some of these clients customize our software or write their own software for generating the XML.

This means we frequently get oddities in the files which cause problems. Thankfully I am not the person who has to verify these files are good. I just get to answer the questions that person has about why a particular file failed to load.

The CE/Vista import process will stop if its validator finds invalid XML. Unfortunately, the error “An exception occurred while obtaining error messages. See webct.log” doesn’t sound like invalid XML.

Usage is pretty simple:

xmllint --valid /path/to/file.xml | head

  1. If the file is valid, then the whole file is in the output.
  2. If there are warnings, then they precede the whole file.
  3. If there are errors, then only the errors are displayed.

I use head here because our files can be up to 15MB, so this prevents the whole file from going on the screen for the first two situations.

I discovered this while researching how to handle the first error below. It came up again today. So this has been useful for catching errors in client-supplied files that failed to load.

1: parser error : XML declaration allowed only at the start of the document
 <?xml version="1.0" encoding="UTF-8"?>

162: parser error : EntityRef: expecting ‘;’
<long>College of Engineering &amp&#059; CIS</long>

The number before the colon is the line number. The caret it uses to indicate where on the line an error occurred isn’t accurate, so I ignore it.

My hope is to get this integrated into our processes to validate these files before they are loaded and save ourselves headaches the next morning.
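
A pre-load check along those lines might look something like this rough Python sketch. It is not the CE/Vista validator and it is not xmllint: it only tests well-formedness with the standard library parser (no DTD validation as with --valid), and the function name check_well_formed and the sample data are made up for illustration.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_bytes):
    """Return None if the document parses, else (line, message) for the first error.

    Hypothetical helper: checks well-formedness only, not validity against a DTD.
    """
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError as err:
        line, _column = err.position  # the parser reports (line, column)
        return line, str(err)
    return None


# A leading space before the XML declaration, as in the first error above.
bad_declaration = b' <?xml version="1.0" encoding="UTF-8"?>\n<long>CIS</long>'
print(check_well_formed(bad_declaration))
```

Anything other than None would be a reason to hold the file back and look at the reported line before the nightly load.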

When I read something like this, I start to question the validity of the method.

Psychologist Sam Gosling analyzed the Facebook profiles of 236 college-aged people, who were also asked to fill out personality questionnaires… surveys that were designed to assess not only how study participants viewed themselves in reality, but also what their personalities would be like if they had all of their ideal traits.
The Psychology of Facebook Profiles | TIME

The better experiment here is to have half the participants maintain a normal Facebook profile. The other half would create a profile demonstrating their ideal self. Then compare both sets of profiles against the Big Five questionnaire results. The list of personality traits in the article, “openness, agreeableness, conscientiousness, extraversion and neuroticism”, gives away the test used despite it not being explicitly named. Of course, I’m no fan of the Big Five.

Should the results match, you can say Facebook reveals whatever the Big Five measures. However, I’d be uncomfortable saying any instrument measuring self-reported information accurately reflected anything about a person’s real personality.

Rather than depend on end users to accurately report the browser used, I look for the user-agent in the web server logs. (Yes, I know it can be spoofed. Power users would be trying different things to resolve their own issues, not coming to us.)

Followers of this blog may recall I changed the WebLogic config.xml to record user agents to the webserver.log.

One trick I use is splitting on the double quotes in awk to pick out just the user agent. This information is then sorted by name so uniq -c can count how many of each are present. Finally, I sort again by number with the largest at the top to see which are the most common.

grep <term> webserver.log | awk -F\" '{print $2}' | sort | uniq -c | sort -n -r

This is what I will use when looking for a specific user. If I am looking at a wider range, such as the user agents for hits on a page, then I probably will use the head command to look at the top 20.
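
For anyone without awk handy, the same counting can be sketched in Python. This is only an approximation of the pipeline above: it assumes, like the awk command, that the user agent is the second double-quote-delimited field, it matches the term as a plain substring rather than a grep pattern, and the function name and sample log lines are invented for illustration.

```python
from collections import Counter


def count_user_agents(lines, term, top=None):
    """Count user agents on lines containing term, most common first."""
    agents = Counter()
    for line in lines:
        if term not in line:
            continue  # plays the part of grep <term>
        fields = line.split('"')
        if len(fields) > 1:
            agents[fields[1]] += 1  # same field as awk's $2 with -F\"
    return agents.most_common(top)


# Invented log lines; real webserver.log entries will look different.
sample = [
    '10.0.0.1 jdoe GET /page "Mozilla/5.0 Firefox/3.0.11" 200',
    '10.0.0.2 jdoe GET /page "Mozilla/4.0 (compatible; MSIE 7.0)" 200',
    '10.0.0.1 jdoe GET /other "Mozilla/5.0 Firefox/3.0.11" 200',
    '10.0.0.3 asmith GET /page "Mozilla/5.0 Firefox/3.5" 200',
]

for agent, hits in count_user_agents(sample, "jdoe"):
    print(hits, agent)
```

Passing top=20 plays the role of head when only the top 20 matter.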

A “feature” of this is getting the build (Firefox 3.0.11) rather than just the version (Firefox 3). To count just a version, I tend to use something more like this:

grep <term> webserver.log | awk -F\" '{print $2}' | grep -c '<version>'

I have yet to see many CE/Vista URIs with the names of web browsers. So these are the most common versions one would likely find (what to grep – name – notes):

  1. MSIE # – Microsoft Internet Explorer – I’ve seen 5 through 8 in the last few months.
  2. Firefox # – Mozilla Firefox – I’ve seen 2 through 3.5. There is enough difference between 3 and 3.5 (also 2 and 2.5) I would count them separately.
  3. Safari – Apple/WebKit – In searching for this one, I would add a 'grep -v Chrome' to the search to eliminate Google Chrome user agents.
  4. Chrome # – Google Chrome – Only versions 1 and 2.
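
The Safari-minus-Chrome counting from the list can be sketched in Python as well. The helper name, its parameters, and the user agent strings below are illustrative assumptions, not values taken from our logs.

```python
def count_browser(agents, marker, exclude=None):
    """Count user agents containing marker, skipping any that contain exclude."""
    return sum(
        1
        for agent in agents
        if marker in agent and (exclude is None or exclude not in agent)
    )


# Made-up user agent strings in the general shape browsers send.
agents = [
    "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)",
    "Mozilla/5.0 (Macintosh; U) AppleWebKit/530.19 (KHTML, like Gecko) Version/4.0 Safari/530.19",
    "Mozilla/5.0 (Windows; U) AppleWebKit/530.5 (KHTML, like Gecko) Chrome/2.0.172.33 Safari/530.5",
]

print(count_browser(agents, "MSIE 7"))                    # like grep -c 'MSIE 7'
print(count_browser(agents, "Safari", exclude="Chrome"))  # like grep -v Chrome | grep -c Safari
```

The exclude parameter exists because Chrome user agents also carry the word Safari, which is why the plain Safari count would otherwise include them.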

Naturally there are many, many others. It surprised me to see iPhone and Android on the list.

CE/Vista Reports and Tracking displays summaries of activity. If an instructor seeks to know who clicked on a specific file, then Reports and Tracking falls down on the job.

The Course Instructor role can produce a report of the raw tracking data. However, access to the role falls under the Administration tab, so people running the system need to make a user specifically to enroll themselves at the course level to get the reports. (Annoying.)

Instead, the administrators for my campuses pass requests to generate reports up to my level of support. For providing these I have SQL to produce a report. This example is for users who clicked on a specific file. The learning context id and the file name pattern are the values the SQL composer will need to alter.

set lines 200 pages 9999
col user format a20
col action format a32
col pagename format a80

clear breaks computes
break on User skip 1
compute count of Action on User

select tp.user_name "User",ta.name "Action",
      to_char(tua.event_time,'MM/DD/RR HH24:MI:SS') "Time",
      NVL(tpg.name,'--') "PageName"
  from trk_person tp, trk_action ta, trk_user_action tua,
      trk_page tpg, learning_context lc
  where tp.id = tua.trk_person_id
    and ta.id = tua.trk_action_id
    and tua.trk_page_id = tpg.id (+)
    and tua.trk_learning_context_id = lc.id
    and lc.id = 1234567890                -- change: learning context id of the section
    and tpg.name like '%filename.doc%'    -- change: file (page) name to match
  order by tp.user_name,tua.event_time
/

Output

  • User aka tp.user_name – This is the student’s account.
  • Action aka ta.name – This is an artifact of the original script. You might drop it as meaningless from this report.
  • Time aka tua.event_time – Day and time the action took place.
  • PageName aka tpg.name – Confirmation of the file name. Keep it if the where clause uses like on this column.

Considerations

I use the learning context id (lc.id aka learning_context.id) because in my multi-institution environment, the same section name could be used in many places. This id ensures I do not get data from multiple sections.

The tricky part is identifying the file name. HTML files generally will show up as the name in the title tag (hope the instructor never updates it). Office documents generally will show as the file name. Here are a couple of approaches to determining how to use tpg.name (aka trk_page.name).

  1. Look at the file in the user interface.
  2. Run the report without limiting results to any tpg.name. Identify out of the results the name you wish to search and use: tpg.name = 'page name'

Most tracked actions do have a page name. However, some actions do not. This SQL is designed to print a "--" in those cases.

On the BLKBRD-L email list is a discussion about proving students are cheating. Any time the topic comes up, someone says a human in a room is the only way to be sure. Naturally, someone else responds with the latest and greatest technology to detect cheating.

In this case, Acxiom offers identity verification:

By matching a student’s directory information (name, address, phone) to our database, we match the student to our database. The student then must answer questions to verify their identity, which may include name, address and date of birth.


The institution never releases directory information so there are no Family Educational Rights and Privacy Act (FERPA) violations.

However, to complete the course work the student is forced to hand over the information to Acxiom, an unknown and potentially untrusted party. Why should students trust Acxiom when institutions cannot be trusted?

Due to the decentralized nature of IT departments, higher education leads all industries in numbers data breach events. Acxiom’s verification capabilities were designed so that student and instructor privacy is a critical feature of our solution. Institutions never receive the data Acxiom uses in this process. They are simply made aware of the pass/fail rates.

In other words, higher education institutions cannot be trusted to handle this information. No reason was provided as to why Acxiom can be better trusted. Guess the people reading this would never check to see whether Acxiom has also had data breaches.

This Electronic Frontier Foundation response to Acxiom’s claim that its method is more secure was interesting:

True facts about your life are, by definition, pre-compromised. If the bio question is about something already in the consumer file, arguably the best kind of question is about something that is highly unlikely to be in one’s consumer file and even useless commercially–like my pet’s name.

Answering these kinds of questions feels more like a violation than a preservation of privacy.
