Tuesday, 19 February 2008




Google Tech-Talk Computer Science Video Lectures

Hey everyone! This time I am posting Google Tech-Talk computer science

video lectures which I found interesting.

There are many, many more Google Tech-Talk video lectures available

here:

Google Tech-Talk Lectures

2,3,5, Infinity!

* Video Lecture by Paul Hildebrandt

ABSTRACT: Nearly 60 years after the first electronic digital computer

was designed at the Princeton Institute for Advanced Studies (IAS),

companies like Google are demonstrating the power of a world built

from 1s and 0s. Zome is a system that models the world built from the

numbers 2, 3 and 5. We will explore how these numbers are knotted

together to form the structure of space, from the subatomic framework

of the atom, to the geometry of life, to a recently proposed "shape"

of the universe!

Creative Commons for Googlers

* Video Lecture by Mike Linksvayer

ABSTRACT: Creative Commons provides tools that enable the legal

sharing and re-use of creative and educational materials online. Come

learn about Creative Commons, what they're doing, and how Google might

help. Creative Commons' general counsel will be on hand to answer

questions about CC copyright licenses and other legal issues, but the

presentation will focus on technical projects at Creative Commons:

license-aware web search, microformats, reliable metadata-embedding in

various media types, and licensing integration with user generated

content platforms.

iClaustron: Open Source Grid Cluster Storage Controller

* Video Lecture by Mikael Ronstrom

ABSTRACT: Many applications have requirements to store petabytes of

base data and many terabytes of structured data. Examples of this are

genealogy, astronomy, biotech and so forth. This talk will discuss

requirements from the genealogy application and show how these

requirements require building very large clustered systems with a

hierarchy of clusters. These clusters are used to both store base data

and structured data. He goes on to show how these requirements

translate into a systems architecture with essential components of

off-the-shelf servers, cheap storage, clustered software and

integrated cluster interconnects.

The Technology Behind Debian's Testing Release

* Video Lecture by Anthony Towns

ABSTRACT: Current Debian Project Leader, former Release Manager and all

round good guy, Anthony "aj" Towns will give an in depth look at the

ideas and code that hold Debian's "testing" suite together, from its

initial genesis, through basic prototypes, to the "final"

implementation and the couple of rewrites it's had since. The numerous

optimisations used to make the ideas actually operate in an even

vaguely acceptable amount of time will be examined, along with the various

tricks and tools used in development and debugging

(including malloc debugging, writing C extensions to perl and python,

and libapt versus libdpkg).

Better, faster, smarter: Python yesterday, today ... and tomorrow

* Video Lecture by Alex Martelli

A lecture on the Python programming language, with emphasis on

Python 2.5 but also a historical review of 2.2, 2.3 and 2.4.

Security is Broken

* Video Lecture by Rik Farrow

ABSTRACT: Our computer security model is broken. Worse yet, it never

really has worked at all well, and is even less suitable for today's

uses. In this talk, I explore the history behind the design of the

current security both in hardware and operating systems. Instead of

evolving a more secure model over time, system designers have actually

managed to make things worse, creating insecurity in depth. Most of

today's systems are single user machines: certainly desktops and

laptops, but also most servers. The current security model was not

designed to protect users from themselves, and this goes a long way

towards understanding why security is so difficult. I end by looking

at strategies for improving security -- but no real solutions. The

point is to start thinking outside of the box, while adopting best

practices today. What we have done in the past has not worked, and

cannot work. We need to look at the security model in a new way, and that

is the real point of this presentation.

High Radix Interconnection Networks

* Video Lecture by William J. Dally

ABSTRACT: High-radix interconnection networks offer significantly

better cost/performance and lower latency than conventional

(low-radix) topologies. Increasing radix is motivated by the

exponential increase in router pin bandwidth over time. Increasing the

radix or degree of a router node is a more efficient way to exploit

this increasing bandwidth than making channels wider. A high radix

poses several challenges in router design because the internal

structures of conventional routers (e.g., the allocators) scale

quadratically with radix. A hierarchical switch organization with

internal buffering yields a scalable design with near-optimal

performance. A high-radix "flattened butterfly" topology, enabled by

recent developments in global adaptive routing, offers twice the

performance of a comparable-cost Clos network on balanced traffic.

Many of these developments have been incorporated in the YARC router

and interconnection network for the Cray Black Widow Supercomputer.

Data Representation/Laplace Operator

* Video Lecture by David Vladimirovich Ingerman

ABSTRACT: Data Representation by Graphs, Matrices, Formulas, and

Continued Fractions, and Inverse Problems for the Laplace Operator.

Using Statistics to Search and Annotate Pictures

* Video Lecture by Nuno Vasconcelos

ABSTRACT: The last decade has produced significant advances in

content-based image retrieval, i.e. the design of computer vision

systems for image search.

I will review our efforts in the area, with emphasis on the subject of

semantic retrieval. This consists of learning to annotate images, in

order to support natural language queries. In particular, I will argue

for a retrieval framework which combines the best properties of

classical "query by visual example" (QBVE), and more recent semantic

methods, and which we denote as "query by semantic example" (QBSE).

While simple, we show that, when combined with ideas from multiple

instance learning, this framework can be quite powerful. It improves

semantic retrieval along a number of dimensions, the most notable of

which is generalization (out-of-vocabulary queries). It can also be

directly compared to query by example, making it possible to quantify

the gains of representing images in semantic spaces.

Our results show that these gains are quite significant, even when the

semantic characterization is noisy and somewhat unreliable. This

suggests an interesting hypothesis for computer vision: that it may

suffice to adopt simple visual models, as long as they operate at

various levels of abstraction and are learned from large amounts of

data.

Badvertisements: Stealthy Click Fraud with Unwitting Accessories

* Video Lecture by Dr. Markus Jakobsson

ABSTRACT: We describe a new type of threat to the Internet

infrastructure, in the shape of a highly efficient but very well

camouflaged click-fraud attack on the advertising infrastructure, not

using any type of malware. The attack, which we refer to as a

"badvertisement", is described and experimentally verified on several

prominent advertisement schemes. This stealthy attack can be thought

of as a threatening mutation of spam and phishing attacks, with which

it has many commonalities, except for the fact that it is not the

targeted individual who is the victim in the attack, but the

advertiser.

Decision Making and Chance

* Video Lecture by Dr. Mike Orkin

ABSTRACT: Certain gambling games, such as roulette and craps, are

games of pure chance: In repeated play, luck disappears, and the

persistent gambler will go broke. Other gambling activities, such as

betting on sports or the stock market, may involve an element of

skill. One way to measure this is to compare the results of a gambling

strategy with chance: A skillful strategy should produce long-run

results that are better than would be achieved by someone who is just

guessing. One can also compare a gambler's losses with chance to see

if the gambler is doing worse than chance would allow. I will discuss

two recent projects that illustrate these concepts:

* Automated data mining software discovers that the Baltimore Ravens

are 17-3 versus the point spread when they lost their previous

game and their opponents played their previous game on the road.

Do situations like this give clever gamblers an edge or are such

strong win-loss records merely random flukes?

* A gambler loses $30 million betting at an online casino. Is it

possible to lose this much just by chance or is the gambler being

cheated? Or maybe the gambler is part of a money laundering

scheme.
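
As a rough illustration of "comparing with chance", the sketch below computes how likely a record of at least 17 wins in 20 games would be if every game against the point spread were a fair coin flip. The fair-coin model and the function name `tail_probability` are assumptions made for this example, not something taken from the talk.

```python
from math import comb

def tail_probability(wins: int, games: int, p: float = 0.5) -> float:
    """Return P(X >= wins) for X ~ Binomial(games, p)."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )

# Assumed model: each game against the spread is an independent fair coin flip.
p_value = tail_probability(17, 20)
print(f"P(at least 17 wins in 20 fair games) = {p_value:.5f}")
```

A small probability for one hand-picked rule does not settle the question on its own: software that tests a very large number of candidate rules can be expected to turn up some records this extreme purely by chance.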

The Electric Sheep and their Dreams in High Fidelity

* Video Lecture by Scott Draves a.k.a. Spot

ABSTRACT:

Electric Sheep is a distributed screen-saver that harnesses idle

computers into a render farm with the purpose of animating and

evolving artificial life-forms known as sheep. The votes of the users

form the basis for the fitness function for a genetic algorithm on a

space of abstract animations. Users also may design sheep by hand for

inclusion in the gene pool.

This cyborg mind composed of 35,000 computers and people was used to

make Dreams in High Fidelity: a painting that evolves. It consists of

55GB of high definition sheep that would have taken one computer over

100 years to render, played back to form a nonrepeating continuously

morphing image.

The talk will cover the genetic code and renderer, the genetic

algorithm, how error correction is built into the distributed renderer

while minimizing performance penalty, and how to distribute 750GB of

video per day without paying for it. The talk will include a demo of

the artwork.

ReUsable Web Components with Python and Future Python Web Development

* Video Lecture by Ben Bangert

ABSTRACT: Python's Web Server Gateway Interface (WSGI) not only

enables a multitude of Python web frameworks to share code when it

comes to deployment, but also enables entirely new levels of re-use

for Python web development. This talk focuses on explaining WSGI,

showing new types of re-use with WSGI middleware, and exploring new frameworks

that rely heavily on WSGI; in this case, Pylons. The aim is to move beyond

monolithic frameworks that try to do everything themselves, to new

modes of development where you can use just the parts you want and

still have active development communities to interact with.
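
To make the idea of WSGI middleware concrete, here is a minimal sketch using only the standard library. The names `hello_app`, `AddHeaderMiddleware` and `call_app` are hypothetical and chosen for this example; they are not taken from the talk or from Pylons.

```python
from wsgiref.util import setup_testing_defaults

def hello_app(environ, start_response):
    """A minimal WSGI application: a callable taking environ and start_response."""
    body = b"Hello, WSGI!\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]

class AddHeaderMiddleware:
    """Hypothetical middleware: wraps any WSGI app and appends one response header."""

    def __init__(self, app, name, value):
        self.app = app
        self.header = (name, value)

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Pass the call through, with the extra header appended.
            return start_response(status, headers + [self.header], exc_info)
        return self.app(environ, patched_start_response)

def call_app(app):
    """Call a WSGI app once with a fake request; return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal, valid WSGI environ
    result = {}

    def start_response(status, headers, exc_info=None):
        result["status"] = status
        result["headers"] = headers

    body = b"".join(app(environ, start_response))
    return result["status"], result["headers"], body

wrapped = AddHeaderMiddleware(hello_app, "X-Example", "middleware")
status, headers, body = call_app(wrapped)
print(status, headers, body)
```

Because the middleware is itself a WSGI callable, it can wrap an application from any WSGI-compliant framework, which is the kind of re-use the abstract describes.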

Nanowires and Nanocrystals for Nanotechnology

(not computer science but too interesting to miss)

* Video Lecture by Yi Cui

ABSTRACT: Nanowires and nanocrystals represent important nanomaterials

with one-dimensional and zero-dimensional morphology, respectively.

Here I will give an overview on the research about how these

nanomaterials impact the critical applications in faster transistors,

smaller nonvolatile memory devices, efficient solar energy conversion,

high-energy battery and nanobiotechnology.

Measuring Programmer Productivity

* Video Lecture by Vikram Aggarwal and Viral Shah

ABSTRACT: Developers have been programming for the last 30 years in a

wide variety of programming languages. Over the years, we have all

developed a feeling for what it is in a programming language that

makes us productive as programmers. As part of the DARPA HPCS (High

Productivity Computing Systems) program, we are developing models and

tools to measure programmer productivity. We will describe our data

gathering process, and our effort to model programmer workflows using

timed Markov models.

Sparse and large-scale learning with heterogeneous data

* Video Lecture by Gert Lanckriet

ABSTRACT: An important challenge for the field of machine learning is

to deal with the increasing amount of data that is available for

learning and to leverage the (also increasing) diversity of

information sources describing these data. Beyond classical vectorial

data formats, data in the format of graphs, trees, strings and beyond

have become widely available for data mining, e.g., the linked

structure of the world wide web, text, images and sounds on web pages,

protein interaction networks, phylogenetic trees, etc. Moreover, for

interpretability and economical reasons, decision rules that rely on a

small subset of the information sources and/or a small subset of the

features describing the data are highly desired: sparse learning

algorithms are a must. This talk will outline two recent approaches

that address sparse, large-scale learning with heterogeneous data, and

show some applications.

Code Generation With Ruby

* Video Lecture by Jack Herrington

A talk about code generation techniques using Ruby. He will cover both

do-it-yourself and off-the-shelf solutions in a conversation about

where Ruby is as a tool, and where it's going.

Random Sampling from a Search Engine's Index

* Video Lecture by Ziv Bar-Yossef

ABSTRACT: We revisit a problem introduced by Bharat and Broder almost

a decade ago: how to sample random pages from a search engine's index

using only the search engine's public interface?

In this paper we introduce two novel sampling techniques: a

lexicon-based technique and a random walk technique. Our methods

produce biased sample documents, but each sample is accompanied by a

corresponding "weight", which represents the probability of this

document being selected in the sample. The samples, in conjunction

with the weights, are then used to simulate near-uniform samples. To

this end, we resort to three well known Monte Carlo simulation

methods: rejection sampling, importance sampling and the

Metropolis-Hastings algorithm.

We analyze our methods rigorously and prove that under plausible

assumptions, our techniques are guaranteed to produce near-uniform

samples from the search engine's index. Experiments on a corpus of 2.4

million documents substantiate our analytical findings and show that

our algorithms do not have significant bias towards long or highly

ranked documents.
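
The step from weighted, biased samples to near-uniform ones can be sketched with plain rejection sampling on a toy corpus. Everything here is an assumption made for illustration: the four-document corpus, the exactly known selection probabilities, and the helper names. In the real setting the weights come from the sampler and may only be known approximately.

```python
import random
from collections import Counter

def rejection_sample(biased_draw, weight, min_weight, n_draws, rng):
    """Keep each biased draw with probability min_weight / weight(doc).

    Documents that the biased sampler over-selects are rejected more often,
    so the accepted documents are uniformly distributed.
    """
    accepted = []
    for _ in range(n_draws):
        doc = biased_draw()
        if rng.random() < min_weight / weight(doc):
            accepted.append(doc)
    return accepted

rng = random.Random(0)  # fixed seed so the run is reproducible

# Toy corpus: the probability with which the biased sampler picks each document.
selection_prob = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}
docs = list(selection_prob)
probs = [selection_prob[d] for d in docs]

def biased_draw():
    """Draw one document according to the biased distribution."""
    return rng.choices(docs, weights=probs, k=1)[0]

accepted = rejection_sample(biased_draw, selection_prob.get, min(probs), 100_000, rng)
counts = Counter(accepted)
fractions = {d: counts[d] / len(accepted) for d in docs}
print(fractions)  # each fraction should be close to 0.25
```

The price of uniformity is wasted draws: in this toy setup only about 40% of the biased samples are accepted, which is one reason to also consider importance sampling or Metropolis-Hastings.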

A New Way to look at Networking

* Video Lecture by Van Jacobson

ABSTRACT: Today's research community congratulates itself for the

success of the internet and passionately argues whether circuits or

datagrams are the One True Way. Meanwhile the list of unsolved

problems grows.

Security, mobility, ubiquitous computing, wireless, autonomous

sensors, content distribution, digital divide, third world

infrastructure, etc., are all poorly served by what's available from

either the research community or the marketplace. I'll use various

strained analogies and contrived examples to argue that network

research is moribund because the only thing it knows how to do is fill

in the details of a conversation between two applications. Today as in

the 60s problems go unsolved due to our tunnel vision and not because

of their intrinsic difficulty. And now, like then, simply changing our

point of view may make many hard things easy.

Privacy Preserving Data Mining

* Video Lecture by Matthew Roughan

ABSTRACT: The rapid growth of the Internet over the last decade has

been startling. However, efforts to track its growth have often fallen

afoul of bad data --- for instance, how much traffic does the Internet

now carry? The problem is not that the data is technically hard to

obtain, or that it does not exist, but rather that the data is not

shared. Obtaining an overall picture requires data from multiple

sources, few of whom are open to sharing such data, either because it

violates privacy legislation, or exposes business secrets. The

approaches used so far in the Internet, e.g., trusted third parties,

or data anonymization, have been only partially successful, and are

not widely adopted.

The paper presents a method for performing computations on shared data

without any participants revealing their secret data. For example, one

can compute the sum of traffic over a set of service providers without

any service provider learning the traffic of another. The method is

simple, scalable, and flexible enough to perform a wide range of

valuable operations on Internet data.
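
One well-known way to compute such a sum is additive secret sharing, sketched below. This illustrates the general idea and is not necessarily the exact protocol presented in the talk; the modulus, the function names and the traffic figures are all made up for the example.

```python
import secrets

MODULUS = 2**64  # assumed bound: the true total must be smaller than this

def make_shares(value, n_parties):
    """Split value into n_parties random shares that sum to value modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def secure_sum(private_values):
    """Simulate the protocol in one process and return the total."""
    n = len(private_values)
    # Row i holds the shares created by party i; entry j is sent to party j.
    share_matrix = [make_shares(v, n) for v in private_values]
    # Party j adds up the shares it received and publishes only that subtotal.
    subtotals = [
        sum(share_matrix[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # The published subtotals add up to the true total.
    return sum(subtotals) % MODULUS

traffic = [120, 340, 75]  # hypothetical per-provider traffic volumes
total = secure_sum(traffic)
print(total)
```

Each provider only ever sees random-looking shares and subtotals, so no single provider learns another's value as long as the providers do not collude.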

Near-optimal Monitoring of Online Data Sources

* Video Lecture by Ryan Peterson

ABSTRACT: Crawling the Web for interesting and relevant changes has

become increasingly difficult due to the abundance of frequently

changing information. Common techniques for solving such problems make

use of heuristics, which do not provide performance guarantees and

tend to be tailored to specific scenarios or benchmarks.

In this talk, I will present a principled approach based on

mathematical optimization for monitoring high-volume online data

sources. We have built and deployed a distributed system called Corona

that enables clients to subscribe to Web pages and notifies clients of

updates asynchronously via instant messages. Corona assigns multiple

nodes to cooperatively monitor each Web page and employs a novel

decentralized optimization technique for distributing the monitoring

load. In its currently running form, the optimization algorithm

guarantees the best update detection time on average without exceeding

resource constraints on the monitoring servers. Based on simulations

and measurements on our deployed system, I will show that Corona

performs substantially better than commonly used heuristics.

Related Posts

* Free Computer Science Video Lecture Courses

(Courses include web application development, lisp/scheme

programming, data structures, algorithms, machine structures,

programming languages, principles of software engineering, object

oriented programming in java, systems, computer system

engineering, computer architecture, operating systems, database

management systems, performance analysis, cryptography, artificial

intelligence)

* More Mathematics and Theoretical Computer Science Video Lectures

(Includes algebra, elementary statistics, applied probability,

finite mathematics, trigonometry with calculus, mathematical

computation, pre-calculus, analytic geometry, first year calculus,

business calculus, mathematical writing (by Knuth), computer

science problem seminar (by Knuth), dynamic systems and chaos,

