Monday Jan. 29 Anticipated Syllabus

Monday Feb. 5 Fundamentals of Scientific Computation (Rob Manning)

Thursday Feb. 8 Fundamentals of Scientific Computation (Lyle Roelofs & Julio dePaula)

Monday Feb. 12 Fundamentals of Scientific Computation (Rob Manning)

Thursday Feb. 15 N-body Dynamics (Dave Wonnacott)

Thursday Feb. 22 N-body Dynamics (Dave Wonnacott)

Monday Feb. 26 Molecular Modeling (Julio dePaula, Suzanne Amador Kane & Rob Manning)

Thursday Mar. 1 Genomics (Curtis Greene, Philip Meneely & Jenni Punt)

Monday Mar. 5 Monte Carlo Model (John Dougherty & Rob Manning)

Thursday Mar. 8 Image Analysis (Lyle Roelofs)

Monday Mar. 19 Predator Prey Models (Kate Heston)

Thursday Mar. 22 Molecular Modeling #2 (Rob Manning & Suzanne Amador Kane)

Monday Mar. 26 Molecular Modeling #3 (Suzanne Amador Kane & Julio dePaula)

Thursday Mar. 29 N-Body Dynamics #3 (David Wonnacott)

Monday Apr. 2 Monte Carlo Model #2 (Rob Manning & John Dougherty)

Thursday Apr. 5 Round Table Discussion of 'How is this course progressing?'

Monday Apr. 9 Genomics #2 (Curtis Greene, Philip Meneely & Jenni Punt)

Thursday Apr. 12 Genomics #3 (Curtis Greene, Philip Meneely & Jenni Punt)

Monday Apr. 16 Genomics #4 (Curtis Greene, Philip Meneely & Jenni Punt)

Thursday Apr. 19 Genomics #5 (Curtis Greene, Philip Meneely & Jenni Punt)

Thursday Apr. 19 Predator-Prey Model #2 (Kate Heston)

Monday Apr. 23 Image Analysis #2 (Lyle Roelofs)

Thursday May 3 CAtS Roundtable discussion: An Overview

We set up our schedule for the semester.

The anticipated syllabus for pre-CAtS is:

I. Fundamentals of Scientific Computation

Issues in modeling scientific problems: discretization, errors

Computational issues: basic algorithmic structures, parallelization

Introduction to Mathematica

Internet (and other) sources for scientific computation

(We envision beginning pre-CAtS with presentations in these areas primarily by the co-organizers, supplemented by John Dougherty for parallelization)

II. Genomics and Database Searching

Genomics (Phil Meneely)

Differential Display Analysis (Jenni Punt)

Mathematical algorithms (Curtis Greene)

III. Dynamic modeling

Predator-Prey models (Kate Heston)

Galactic dynamics (Dave Wonnacott)

Fluid Flow (Jerry Gollub)

IV. Statistical modeling

Statistical simulation of biopolymers (Suzanne Amador Kane)

Monte Carlo/Simulated annealing (Rob Manning, John Dougherty)

V. Image analysis

Analysis of radio astronomy data (Lyle Roelofs)

VI. Optimization

Molecular Mechanics (Julio de Paula)

Lyle reported that there are ongoing discussions concerning integrating our computational and numeracy requirements across the science and math majors. We discussed what the basic contents of such a single course might be.

The following is a summary prepared by Lyle:

An Applied Calculus course for Natural Science Majors

Participants in the faculty seminar in Computing Across the Sciences identified the following fundamental concepts in applied mathematics as broadly useful for their majors. A course featuring these concepts might fruitfully replace a semester of calculus in the first year for likely majors in astronomy, biology, chemistry and physics.

Understanding graphs of functions

- relation of slope to derivative
- meaning of integration

How to represent data graphically

How to fit equations to data

Meaning of a differential equation and some simple examples giving

- wavelike solutions
- growing and decaying exponentials

- rate laws for chemical kinetics

Meaning of distributions and some of the most common forms

- error analysis
- mean and standard deviation

Basic ideas of probability

To achieve an interesting course one would want to weave actual applications from the sciences into the teaching of these basic mathematical concepts.

We have not yet considered what the shared interests might be in computational techniques and technology.

- Attended by:
- Lynne Butler, Math
- John Dougherty, Computer Science
- Curtis Greene, Math
- Julio dePaula, Chemistry
- Jerry Gollub, Physics
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Rob Manning, Math
- Phil Meneely, Biology
- Kim Minor, Program Coordinator
- Jenni Punt, Biology
- Lyle Roelofs, Physics
- David Wonnacott, Computer Science

Presentation Topic: Fundamentals of Scientific Computation

Presented by: Rob Manning

Mathematica files:

The notebook shown during our meeting.

The standard introduction to Mathematica notebook that I give to students (use as a tutorial).

- Attended by:
- John Dougherty, Computer Science
- Curtis Greene, Math
- Julio dePaula, Chemistry
- Jerry Gollub, Physics
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Rob Manning, Math
- Phil Meneely, Biology
- Kim Minor, Program Coordinator
- Jenni Punt, Biology
- Lyle Roelofs, Physics
- David Wonnacott, Computer Science

Presentation Topic: Fundamentals of Scientific Computation & Integrated Science Infrastructure Courses

Presented by: Lyle Roelofs and Julio dePaula

Due to jury duty, Rob Manning was unable to make his scheduled presentation on some platforms for the modules to be developed. That presentation will be given on 02/12/01.

Lyle led further discussion of Rob's document on the fundamentals of scientific computation, completing the last two topics.

Julio updated the participants on the ongoing discussions on an integrated introduction to lab skills, scientific literacy and numeracy. The options of integrating such a course into the freshman seminar program or developing modules to be used in the various introductory courses were debated by the group.

Jerry Gollub requested a change of status to that of "auditor". He regrets that he will not be able to develop the Dynamic Model Fluid Flow module.

- Attended by:
- John Dougherty, Computer Science
- Curtis Greene, Math
- Julio dePaula, Chemistry
- Jerry Gollub, Physics
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Phil Meneely, Biology
- Kim Minor, Program Coordinator
- Jenni Punt, Biology
- Lyle Roelofs, Physics
- David Wonnacott, Computer Science

Presentation Topic: Fundamentals of Scientific Computation

Presented by: Rob Manning

Rob demonstrated some platforms:

Mathematica Program

Advantages

Notebook format allows intermingling of text, computation and graphics

Palettes available to make the commands look more like mathematical expressions

Can do high-precision arithmetic

Good symbolic manipulation

Simplifies expressions

Built-in numerical routines

Disadvantages

Import-Export of data is hard

Syntax issues

Unhelpful error messages

Programming environment

OOP is enforced

Not optional for longer programmes

ODE Architect (PC only)

One network server - we have 20 copies

- Attended by:
- John Dougherty, Computer Science
- Curtis Greene, Math
- Julio dePaula, Chemistry
- Jerry Gollub, Physics
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Rob Manning, Math
- Phil Meneely, Biology
- Kim Minor, Program Coordinator
- Jenni Punt, Biology
- Lyle Roelofs, Physics
- David Wonnacott, Computer Science

Presentation Topic: N-body Dynamics

Presented by: Dave Wonnacott

- Attended by:
- John Dougherty, Computer Science
- Curtis Greene, Math
- Julio dePaula, Chemistry
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Rob Manning, Math
- Phil Meneely, Biology
- Kim Minor, Program Coordinator
- Jenni Punt, Biology
- David Wonnacott, Computer Science

Presentation Topic: N-body Dynamics

Presented by: Dave Wonnacott

Dave Wonnacott continued his presentation of a module on galactic dynamics. His basic module shows the motion of 2 or 4 masses, with the associated equations and program statements.

- Extension
- Physics - other effects
- Numerical methods
- Computational issues
- Measuring "complexity" - time needed for computation as a function of number of masses
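One way to make the "complexity" measurement concrete is to count the pairwise force evaluations in a direct-summation calculation, where every mass interacts with every other mass. The Python sketch below is only an illustration and not Dave's module: the function names, the unit gravitational constant `G`, the `softening` parameter, and the 2D circular test layout are all assumptions.

```python
import math

def accelerations(positions, masses, G=1.0, softening=1e-3):
    """Direct-summation gravitational accelerations in 2D.

    Returns (acc, pair_count), where pair_count is the number of
    pairwise force evaluations performed: N*(N-1)/2 for N masses.
    """
    n = len(positions)
    acc = [[0.0, 0.0] for _ in range(n)]
    pair_count = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            # Softening (an assumption) avoids division by zero at contact.
            r2 = dx * dx + dy * dy + softening ** 2
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            # Newton's third law: equal and opposite contributions.
            acc[i][0] += G * masses[j] * dx * inv_r3
            acc[i][1] += G * masses[j] * dy * inv_r3
            acc[j][0] -= G * masses[i] * dx * inv_r3
            acc[j][1] -= G * masses[i] * dy * inv_r3
            pair_count += 1
    return acc, pair_count

def pair_evaluations(n):
    """Count force evaluations for n unit masses placed on a circle."""
    positions = [(math.cos(k), math.sin(k)) for k in range(n)]
    masses = [1.0] * n
    return accelerations(positions, masses)[1]
```

Doubling the number of masses roughly quadruples the pair count, which is the kind of scaling students could measure by timing the basic module; the mesh and tree approaches listed under "Other approaches" exist to reduce this cost.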

Other approaches

- Particles
- Mesh
- Tree (adaptive mesh)

Related problems

Overview question: What activities should students be doing?

- Attended by:
- John Dougherty, Computer Science
- Curtis Greene, Math
- Julio dePaula, Chemistry
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Rob Manning, Math
- Phil Meneely, Biology
- Kim Minor, Program Coordinator
- Jenni Punt, Biology
- Lyle Roelofs, Physics
- David Wonnacott, Computer Science

Presentation Topic: Molecular Modeling

Presented by: Suzanne Amador Kane, Rob Manning & Julio dePaula

Computational and Mathematical Background - Rob Manning

· Matrix algebra / eigenvalues

· Ordinary Differential Equations (ODE's) including methods for solving using discrete math

· Optimization

· Simulated annealing/Monte Carlo methods for these applications

· Statistical methods for pattern matching

Molecular Mechanics / Force-field descriptions - Suzanne Amador Kane

· Effective energies for doing molecular mechanics/dynamics calculations

· Empirical parameters for describing energy function

· Pseudo atoms

· Solvent effects

· Steric energy

· Dipeptide project

· Protein visualization

Molecular Dynamics - Julio de Paula

· Definition of temperature in Molecular dynamics

· Implementing Newtonian equations of motion

· Dipeptide (short polypeptide) project

· Basic introduction to quantum / normal modes, simple harmonic oscillator (SHO)

Statistical Methods

· Structure prediction using homology searches or other statistical methods

· Relates to Genomics module

Software Packages:

- Hyperchem: good for displaying graphics of molecular models; animations of normal modes possible, can do molecular dynamics calculations. Available on one computer in Chemistry--contact Julio. Can be networked and put on multiple PC's.
- Spartan: good for quantum calculations, better graphics than Hyperchem; good for animations of normal modes, good for smaller molecules; no molecular dynamics and little control over molecular mechanics calculations
- Rasmol: readily available on computer clusters, but only for visualization. Good for that, though.
- Insight/Discover: available only on Silicon Graphics workstation in Sharpless, so only one high-end machine. Research-quality calculations, relatively up-to-date molecular mechanics and molecular dynamics. Very flexible, fully configurable. Excellent and very flexible graphs. Cannot be expanded onto a PC cluster.

Web Resources

There are many helpful web pages and web resources which are collected together at links available at FASEB, Protein Society and other sites. The Biophysical Society's Online Textbook has many articles and chapters on these topics also.

- Attended by:
- John Dougherty, Computer Science
- Curtis Greene, Math
- Julio dePaula, Chemistry
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Rob Manning, Math
- Phil Meneely, Biology
- Kim Minor, Program Coordinator
- Jenni Punt, Biology
- Lyle Roelofs, Physics
- David Wonnacott, Computer Science

Presentation Topic: Genomics

Presented by: Curtis Greene, Philip Meneely & Jenni Punt

Jenni Punt presented:

- the biological basis of computational sequence analysis
- case study on B cell follicular cancer, which results from a DNA translocation
- the genetic code for converting the 64 possible RNA codons into the 20 amino acids
- alignment of sequences using BLAST, which compares a query to a library that has now reached 100,000 entries.

Curtis Greene presented:

- Mathematica implementation of algorithms that are used in the sequence alignment search
- showed and explained a scoring matrix
- ran Mathematica versions of the BLAST algorithms
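To illustrate how a scoring scheme drives an alignment, here is a minimal Python sketch of exact global alignment scoring by dynamic programming (the Needleman-Wunsch recurrence). It is not BLAST, which is a heuristic search, and it is not the Mathematica notebook shown in the session. The flat `match`, `mismatch` and `gap` values are assumptions that stand in for a real scoring matrix.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of sequences a and b.

    score[i][j] holds the best score for aligning a[:i] with b[:j].
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]
```

Replacing the flat match/mismatch values with a lookup into a substitution matrix would give residue-specific scoring of the kind the presenters described.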

Philip Meneely presented:

- word size issue in how BLAST works
- biological basis of scoring matrices
- PAM (Percent Accepted Mutations) first developed
- BLOSUM (Blocks Substitution Matrix) now more widely used. Calculated from blocks of distantly related polypeptides.

- Beyond BLAST
- DNA & RNA alignments
- Context based alignment
- Structure based alignment
- Functional assays -- expression, interaction, etc.

- Attended by:
- John Dougherty, Computer Science
- Curtis Greene, Math
- Julio dePaula, Chemistry
- Jerry Gollub, Physics
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Rob Manning, Math
- Phil Meneely, Biology
- Kim Minor, Program Coordinator
- Jenni Punt, Biology
- Lyle Roelofs, Physics
- David Wonnacott, Computer Science

Presentation Topic: Monte Carlo Model

Presented by: John Dougherty & Rob Manning

Note: due to ice, snow and school closures attendance was low.

Rob Manning on Monte Carlo

- Generation of random numbers in Mathematica
- Simulating random processes
- random walk
- polymer chains

- Metropolis Monte Carlo
- Simulated annealing

John Dougherty on Parallel Computing

- Motivation
- Characterizing the gains for going parallel
- Implement by clustering Linux machines
- Run test problems and compare to models of efficiency
- Alternative examples

- Attended by:
- John Dougherty, Computer Science
- Curtis Greene, Math
- Julio dePaula, Chemistry
- Kate Heston, Biology
- Rob Manning, Math
- Phil Meneely, Biology
- Kim Minor, Program Coordinator
- Lyle Roelofs, Physics

Presentation Topic: Image Analysis

Presented by: Lyle Roelofs

Note: Due to Spring Break the next meeting will be Monday March 19, 2001.

Lyle presented an overview of image analysis giving examples from science research in several fields. Important issues include sharpness, background noise and edge detection. He showed how global transforms were useful for focusing in on different aspects of the image.

He has found a guest lecturer, Lawrence O'Gorman, who resides close by, is one of the authors of an important textbook in this area and is willing to visit to give a talk on biometrics applications.

Software presented in Image Analysis included:

NIHImage/Scion (free software) - this is an elementary program with limitations

Linux (currently available in Stokes 8 cluster) - offers a pixel by pixel analysis

Lyle demonstrated Scion, a free image analysis program supported by NIH, on a Dendritic Crystal Growth Image.

- Attended by:
- John Dougherty, Computer Science
- Curtis Greene, Math
- Julio dePaula, Chemistry
- Jerry Gollub, Physics
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Rob Manning, Math
- Phil Meneely, Biology
- Kim Minor, Program Coordinator
- Jenni Punt, Biology
- Lyle Roelofs, Physics
- David Wonnacott, Computer Science

Presentation Topic: Predator Prey Models

Presented by: Kate Heston

Kate introduced the 2 Lotka-Volterra equations, which model the coupled fluctuations of predator and prey populations. She introduced ODE Architect (a software program) and displayed these 2 equations in it. ODE Architect features and graphs were shown.
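For readers without ODE Architect, the two Lotka-Volterra equations can be sketched in a few lines of Python. This is a hedged illustration, not the tutorial shown in the meeting: the parameter names `a`, `b`, `c`, `d`, the fourth-order Runge-Kutta integrator, and the step size are all assumptions.

```python
def lotka_volterra(x, y, a, b, c, d):
    """Right-hand sides: dx/dt = a*x - b*x*y (prey), dy/dt = -c*y + d*x*y (predator)."""
    return a * x - b * x * y, -c * y + d * x * y

def rk4_step(x, y, dt, params):
    """Advance (x, y) by one classical fourth-order Runge-Kutta step."""
    k1x, k1y = lotka_volterra(x, y, *params)
    k2x, k2y = lotka_volterra(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, *params)
    k3x, k3y = lotka_volterra(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, *params)
    k4x, k4y = lotka_volterra(x + dt * k3x, y + dt * k3y, *params)
    return (x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6,
            y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6)

def simulate(x0, y0, params, dt=0.01, steps=2000):
    """Return the list of (prey, predator) values starting from (x0, y0)."""
    trajectory = [(x0, y0)]
    x, y = x0, y0
    for _ in range(steps):
        x, y = rk4_step(x, y, dt, params)
        trajectory.append((x, y))
    return trajectory
```

Calling `simulate` with several starting points and the same `params` mimics trying different initial conditions: each start traces its own closed cycle around the equilibrium at prey = c/d, predator = a/b.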

A nice feature of this program is that it makes it easy to try different initial conditions. A drawback is that the program is PC (not Mac) friendly.

ODE Architect offers population model tutorials which are easy to use. Dave Wonnacott would like to use ODE Architect at the beginning of the course as an introduction to Differential Equations. The students would familiarize themselves with it so they can carry it over into the rest of the course.

There was a group discussion on what the focus of round two of the presentations should be. The consensus is that each presentation topic needs to provide specific assignments and lab exercises.

Initially the course will consist of 2 hours of lecture (weekly) and 1-2 hours instructional lab (weekly). It will be mostly directed with some intellectual material. Then the students will be questioned to see how well they have assimilated the material.

Dave Wonnacott would like overheads, handouts, etc. available from each module.

Students will be expected to do presentations in Part B of the course.

Platform Logistics: Stokes Lab for PC users and Kate Heston's lab for Macintosh users.

Prerequisite is 2 semesters of Calculus.

- Attended by:
- John Dougherty, Computer Science
- Curtis Greene, Math
- Julio dePaula, Chemistry
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Rob Manning, Math
- Phil Meneely, Biology
- Kim Minor, Program Coordinator
- Jenni Punt, Biology
- David Wonnacott, Computer Science

Presentation Topic: Molecular Modeling #2

Presented by: Rob Manning and Suzanne Amador Kane

Rob presented Optimization & Minimization

- Optimization motivation
- Mathematica Notebook to show optimization
- 1 parameter case done by analytic derivative
- many parameter case must be done numerically
- Newton's method
- Find minimum in Mathematica
- Multidimensional problems
- End module 1 with multiple minimum problem
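The one-parameter case in the outline above can be sketched in Python: Newton's method applied to the derivative finds a stationary point, and the starting guess decides which minimum is reached when there are several. This is an illustration only, not Rob's Mathematica notebook; the test function `f` and the helper names are assumptions.

```python
def f(x):
    """Example function with two minima (an assumed test case)."""
    return x ** 4 - 3 * x ** 2 + x

def df(x):
    """Analytic first derivative of f."""
    return 4 * x ** 3 - 6 * x + 1

def d2f(x):
    """Analytic second derivative of f."""
    return 12 * x ** 2 - 6

def newton_minimize(df, d2f, x0, tol=1e-10, max_iter=50):
    """Newton iteration x <- x - f'(x)/f''(x), started at x0.

    Converges to a stationary point of f; it is a minimum only if
    the second derivative there is positive, so the caller should check.
    """
    x = x0
    for _ in range(max_iter):
        curvature = d2f(x)
        if curvature == 0:
            raise ZeroDivisionError("second derivative vanished")
        step = df(x) / curvature
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")
```

Starting at `x0 = -1.0` and at `x0 = 1.0` leads to two different minima of this `f`, only one of which is the global minimum, which is one simple way to pose the multiple-minimum problem.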

Suzanne presented Molecular Mechanics

- Molecular mechanics applied using HyperChem to treat a Dipeptide structure problem
- Look at structural dependence on which residue is included in the molecule
- Compare energy minima associated with beta sheets and alpha helices
- will continue with HyperChem presentation on Monday

**Participating faculty should provide a list of text references for students to use.**

- Attended by:
- John Dougherty, Computer Science
- Curtis Greene, Math
- Julio dePaula, Chemistry
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Rob Manning, Math
- Kim Minor, Program Coordinator
- Jenni Punt, Biology
- Lyle Roelofs, Physics
- David Wonnacott, Computer Science

Presentation Topic: Molecular Modeling #3

Presented by: Suzanne Amador Kane & Julio de Paula

Suzanne presented Molecular Mechanics

- distributed sample assignments for molecular mechanics
- HyperChem demo
- it can build molecules and rotate them and display them in various styles
- it handles energetics well and can optimize structures and pull out potential energy functions for individual coordinates

Julio presented Molecular Dynamics

- Molecular Dynamics introduction
- intro to the motions of actual molecules, time scales, etc.
- special techniques
- Quenched dynamics & Annealing

- student projects
- quenched dynamics of ethane
- dynamics of Dipeptide
- other molecules
- solvent effects

- annotated bibliography

- Additional notes:
- There is a 30 day free trial of HyperChem at www.hyper.com
- This course will need dedicated computers that can run more complicated calculations in HyperChem.
- All presentation material displayed will be password protected to accommodate copyright issues.

- Attended by:
- John Dougherty, Computer Science
- Curtis Greene, Math
- Julio dePaula, Chemistry
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Rob Manning, Math
- Philip Meneely, Biology
- Kim Minor, Program Coordinator
- Jenni Punt, Biology
- Lyle Roelofs, Physics
- David Wonnacott, Computer Science

Presentation Topic: N-body Dynamics #3

Presented by: David Wonnacott

The focus of the 3rd presentation on Galactic Dynamics was the nitty gritty of the background mathematics.

Dave presented a Mathematica notebook that takes the student from an apple falling to the earth up to galactic dynamics, step-by-step.

The group members made many comments concerning notation and the best way to present the underlying mathematics.

An important point is unifying this module with Kate Heston's (Predator-Prey models), in terms of notation and approach. A uniform use of computational symbols is necessary.

- Attended by:
- Curtis Greene, Math
- Julio dePaula, Chemistry
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Rob Manning, Math
- Philip Meneely, Biology
- Kim Minor, Program Coordinator
- Jenni Punt, Biology
- Lyle Roelofs, Physics
- David Wonnacott, Computer Science

Presentation Topic: Monte Carlo Model #2

Presented by: Rob Manning & John Dougherty

**Rob presented on Monte Carlo:**

I. Random #'s - using Mathematica

- Discrete Random numbers
- Mathematica tutorials were offered
- Histograms
- exercises and examples were offered
- loaded die, relates to random genomics sequences

- Continuous Random numbers
- Mathematica tutorials were offered
- exercises and examples were offered
- normal distribution

II. Simulations

- apply energy functions
- discrete simulations: random walk in 1D example, students do 2D as exercise
- continuous simulations: Boltzmann distribution of initial velocities for a molecular dynamics simulation, students can do polymer statistics as a part B module
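The 1D random walk example can be sketched in Python as follows. This is a stand-in for the Mathematica tutorial, not a copy of it; the function names and the choice of unit steps of +1 or -1 with equal probability are assumptions.

```python
import random

def random_walk_1d(steps, rng):
    """Return the positions visited by a walker taking unit steps left or right."""
    position = 0
    path = [0]
    for _ in range(steps):
        position += rng.choice((-1, 1))
        path.append(position)
    return path

def mean_squared_displacement(steps, walkers, seed=0):
    """Average of (final position)^2 over many independent walkers."""
    rng = random.Random(seed)
    total = 0
    for _ in range(walkers):
        total += random_walk_1d(steps, rng)[-1] ** 2
    return total / walkers
```

For this walk the mean squared displacement is expected to grow linearly with the number of steps, which gives students something to check numerically before they attempt the 2D exercise.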

III. Metropolis Monte Carlo

- Mostly unchanged since 1st presentation: review Metropolis algorithm examples to approximate pi, simulate double-well potential, then talk about simulated annealing, with traveling salesman problem as example.
- New idea: show use of simulated annealing in training hidden Markov model for multiple sequence alignment in genomics?
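A minimal Python sketch of the Metropolis rule applied to a double-well potential is given below. It is an assumption-laden illustration rather than the example from the presentation: the potential V(x) = (x^2 - 1)^2, the uniform trial move, the `step_size`, and the function names are all choices made here.

```python
import math
import random

def double_well(x):
    """Assumed double-well potential with minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2

def metropolis(energy, x0, temperature, steps, step_size=0.5, seed=0):
    """Sample x with probability proportional to exp(-energy(x) / temperature).

    Returns (samples, acceptance_fraction).
    """
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    samples = []
    accepted = 0
    for _ in range(steps):
        trial = x + rng.uniform(-step_size, step_size)
        e_trial = energy(trial)
        delta = e_trial - e
        # Metropolis rule: always accept downhill moves, and accept
        # uphill moves with probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            x, e = trial, e_trial
            accepted += 1
        samples.append(x)
    return samples, accepted / steps
```

At low temperature the walker stays in the well it started in, while at higher temperature it crosses the barrier; simulated annealing builds on the same loop by lowering `temperature` gradually during the run.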

**John Dougherty presented Parallel Scientific Computing:**

Approximating PI

Sequential Monte Carlo

Broadcast & Gather

Preliminary Data

- displayed Monte Carlo execution times (computer processing times)
- experiments

Parallel Performance Metrics

- Speedup: S(n) = Tseq / Tpar
- this can be graphed well (Performance Profile)

- Efficiency: E(n) = S(n) / n
- this can be graphed well (Performance Profile)

- Scalability (not well defined but worth the discussion)
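The two metrics above translate directly into code. The Python helpers below are a sketch with assumed names; the timings passed to them would come from the students' own runs and are not supplied here.

```python
def speedup(t_seq, t_par):
    """S(n) = Tseq / Tpar: how many times faster the parallel run is."""
    return t_seq / t_par

def efficiency(t_seq, t_par, n):
    """E(n) = S(n) / n: fraction of ideal speedup achieved on n processors."""
    return speedup(t_seq, t_par) / n

def performance_profile(t_seq, timings):
    """Build (n, S(n), E(n)) rows from a dict mapping n to measured Tpar.

    The rows are sorted by processor count so they can be graphed directly.
    """
    return [(n, speedup(t_seq, t), efficiency(t_seq, t, n))
            for n, t in sorted(timings.items())]
```

An efficiency near 1 means the added processors are being used well; values that fall as n grows are a natural starting point for the scalability discussion.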

There are currently eight machines available to do these exercises. John Dougherty suggests a single consistent language, ideally C++, but we may need to settle for C since that is used in most parallel computing environments such as MPI or PVM (interfaces for parallel computation). The use of a network of Linux workstations is important.

A doable exercise is to give the students a pre-made parallel code and have them run it for various numbers of processors and determine the efficiency and speedups and compare to theory presented in class. Have an instructor in the lab to answer questions that come up.

There was group discussion of the fact that the course content has focused on continuous rather than discrete problems, the one exception being the Traveling Salesman model. There has been some struggle determining what is an appropriate level for this course. Dave Wonnacott believes he needs to "abstract away" as much detail as possible so the students can see the significant pieces of the sequential and parallel algorithms.

- Attended by:
- John Dougherty, Computer Science
- Curtis Greene, Math
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Rob Manning, Math
- Philip Meneely, Biology
- Kim Minor, Program Coordinator
- David Wonnacott, Computer Science

Topic: Round Table Discussion

There was a group discussion on the course development thus far. CAtS will be a second semester course for some Sophomores, but predominantly Juniors.

Expecting Dave Wonnacott to teach this course alone is unrealistic. A solution would be to have a series of guest speakers and to have Pre-CAtS faculty help teach and do demos. Faculty will be required to come up with one demo or science problem for their module. It was agreed that if this course is not problem driven then students will lose interest. (i.e., the course direction should not be algorithm or computer science driven).

There will be a 1 or 2 week introduction to bring students from a variety of majors. It will be a demonstration module to help recruit students. There should be 4-6 interesting demos. After the students are exposed to a preview of the modules then they will be required to show the best final image of the solved problems.

There will be 3 modules:

- Module 0
- types of topics to be covered
- introductory document that lists helpful background reading material/a short bibliography of reference materials
- subset of modules

- Module 1
- go through actual modules
- visit each module/topic once

- Module 2
- last four weeks of the course will be lab work

Having students work in groups of 3 or more would help them develop a deeper understanding of sciences that they're not familiar with (for example, have a group of 4 students work together where one has a biology background, one a computer science background, one a physics background and one a math background). Julio dePaula has used this technique; his feedback will be useful.

Molecular Modeling has been a large stumbling block in this course.

Pre-CAtS faculty need to:

- provide one demo for their module
- provide a 2-5 page module description (introduction to module, content, demos, exercises)

There was some discussion about a "shallow understanding" teaching technique and how the students would react to the lack of detail.

Using a "Concept Map" was introduced as an ongoing assignment. Have the students come up with links and connections between the topics. As a starting point give them a list of topics to include in the map.

- Attended by:
- John Dougherty, Computer Science
- Jerry Gollub, Physics
- Curtis Greene, Math
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Rob Manning, Math
- Philip Meneely, Biology
- Kim Minor, Program Coordinator
- Jenni Punt, Biology
- Lyle Roelofs, Physics
- David Wonnacott, Computer Science

Presentation Topic: Genomics #2

Presented by: Curtis Greene, Philip Meneely & Jenni Punt

**Curtis Greene discussed**

- What is an alignment? (Global & Local & Multiple)
- gaps, identities, substitutions, inserts/deletes

- Scoring and counting alignments
- pair wise

- Algorithms (exact & heuristic & hidden Markov)
- BLAST
- operation
- default

**Phil Meneely continued the discussion**

- Flaws and improvements for alignment scoring
- not all positions in the strings have equal significance
- HMMER (a hidden Markov model) aligns to consensus sequence

- Attended by:
- Julio dePaula, Chemistry
- John Dougherty, Computer Science
- Curtis Greene, Math
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Rob Manning, Math
- Philip Meneely, Biology
- Jenni Punt, Biology
- Lyle Roelofs, Physics
- David Wonnacott, Computer Science

Presentation Topic: Genomics #3

Presented by: Curtis Greene, Philip Meneely & Jenni Punt

**Curtis Greene continued**

- Hidden Markov models - good for profile matching
- model generates Markov chains
- HMM's originally defined for speech recognition
- one trains the model to the family (possibly can encounter bad local minima)
- this is a continuous problem
- then the alignment phase is a discrete optimization problem

A good source is: *Biological Sequence Analysis*, R. Durbin, S. Eddy, A. Krogh, G. Mitchison (heavy math prerequisite)

**Philip Meneely continued**

- Biological uses - practical aspects
- can get family resemblance information
- can take DNA sequence and find all occurrences of a particular family
- DNA background (what does the inactive part of the DNA do?)
- evolutionary relationships

- Predicting RNA structure (stem-loop) is a very hard problem
- stem loop structure introduces highly non-local correlation
- this requires HMM's using "transformational grammar"

- HMM can be used to find the genes in DNA sequences

- Attended by:
- John Dougherty, Computer Science
- Jerry Gollub, Physics
- Curtis Greene, Math
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Rob Manning, Math
- Philip Meneely, Biology
- Jenni Punt, Biology
- Lyle Roelofs, Physics
- David Wonnacott, Computer Science

Presentation Topic: Genomics #4

Presented by: Curtis Greene, Philip Meneely & Jenni Punt

**Philip Meneely discussed** Genome projects followed by Jenni Punt discussing Gene Function

- acquisition
- assembly (joins based on 400 base 'overlaps' - Celera's approach was a little different)
- annotation
- "gene finding"
- 1-2% are active genes making proteins (the rest are regulatory or of no known function)
- approaches
- ab initio / de novo (where did transcripts come from?)
- alignment - use BLAST to find relations based on existing information
- requires HMM for Eukaryotic DNA

- gene function (**Jenni Punt continued** the discussion on gene function)
- we know the complete sets of RNA produced in a cell
- DNA - mRNA (messenger RNA) - spliced mRNA - cDNA (complementary DNA)

- use microarrays to match cDNA from a given cell to all known genes
- this involves much image processing
- must find the patterns of gene expression

- Attended by:
- John Dougherty, Computer Science
- Jerry Gollub, Physics
- Curtis Greene, Math
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Rob Manning, Math
- Philip Meneely, Biology
- Kim Minor, Program Coordinator
- Jenni Punt, Biology
- Lyle Roelofs, Physics
- David Wonnacott, Computer Science

Presentation Topic: Genomics #5

Presented by: Curtis Greene, Philip Meneely & Jenni Punt

Jenni Punt demonstrated some programs she has found that should be shown to the students:

Self Organizing Maps (SOM) offer a basic introduction to algorithms

a clustering approach (each cluster can be arranged in a related way, and you don't need a grid)

competitive learning program

Advantages: more exploratory analysis

Disadvantage: you have to predefine the number of clusters (although Jenni did not find this to be an issue)

Projects for students:

Basic: have them run the programs

More advanced: what changes could be imposed on the program

A bibliography list was offered.

Presentation Topic: Predator-Prey Model #2

Presented by: Kate Heston

ODE Architect has good projects included. Kate demonstrated:

A simulation of two growing populations/competition models of microbes

In-depth simulations that could constitute Module II

1. additional terms in the p-p model
2. an epidemiology simulation

Jerry Gollub/Jenni Punt discussed using this program in the study of AIDS propagation. It would be useful for prediction and intervention. In addition, by using the ODE Architect program you could potentially pinpoint any missing variables in the sample model you are using.

- Attended by:
- Julio dePaula, Chemistry
- John Dougherty, Computer Science
- Jerry Gollub, Physics
- Curtis Greene, Math
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Kim Minor, Program Coordinator
- Jenni Punt, Biology
- Lyle Roelofs, Physics
- David Wonnacott, Computer Science

Presentation Topic: Image Analysis #2

Presented by: Lyle Roelofs

Lyle gave an overview of the Image Analysis #1 seminar, displayed sources and resources, and showed an outline of the talk. He then continued into module #2.

- Biometrics - Gel Plot from Meneely's lab
- loaded image into SCION software to demonstrate how to locate bands accurately. He plotted a profile of a "lane" and showed the peaks and valleys.

- Biometrics - DNA microarrays
- 2 web sites
- www.gene-chips.com
- www.nhgri.nih.gov/DIR/Microarray/

- Fourier Methods - take random data, find the order and filter in Fourier Space

The Basic Idea of Fourier Analysis

Lyle illustrated a basic formula for expressing a sum of waves and applied it in a True BASIC program.

Applied Fourier Methods in image analysis using SCION. A 2 dimensional Fourier Transform will find data patterns/order. Lyle then demonstrated getting rid of garbage/noise data. By finding order in 2D it helps to explain 3D (x-rays).
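The idea of finding order and filtering in Fourier space can be sketched in one dimension with a few lines of Python. This is not the True BASIC program or the SCION workflow from the session: the naive discrete Fourier transform, the "keep the largest coefficients" filter, and all function names are assumptions chosen for brevity.

```python
import cmath
import math

def dft(signal):
    """Naive discrete Fourier transform (O(n^2), fine for short signals)."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
            for k in range(n)]

def inverse_dft(spectrum):
    """Inverse transform, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def fourier_filter(signal, keep):
    """Zero all but the `keep` largest-magnitude Fourier coefficients."""
    spectrum = dft(signal)
    ranked = sorted(range(len(spectrum)),
                    key=lambda k: abs(spectrum[k]), reverse=True)
    kept = set(ranked[:keep])
    filtered = [c if k in kept else 0 for k, c in enumerate(spectrum)]
    return inverse_dft(filtered)
```

For a real signal made of two waves plus noise, keeping four coefficients (each wave contributes a pair) recovers the waves and discards most of the noise; a 2D transform applies the same idea along both image axes.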

Lyle recommended that each member of the Pre-CAtS course bring ideas from this class to share with other Science Faculty.

- Attended by:
- Julio dePaula, Chemistry
- John Dougherty, Computer Science
- Curtis Greene, Math
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Rob Manning, Math
- Philip Meneely, Biology
- Kim Minor, Program Coordinator
- Jenni Punt, Biology
- Lyle Roelofs, Physics
- David Wonnacott, Computer Science

Topic: Roundtable discussion of an overview of CAtS Course

John Dougherty, Rob Manning, Philip Meneely, Lyle Roelofs & Dave Wonnacott (all Pre-CAtS participants are welcome) will meet periodically to shape up the course content.

- Some definitive decisions regarding CAtS:
- Its first offering will be in Spring 2002
- It will be a level 400 course (CS450)
- Two 1 hour lectures and a one hour group lab

- The course needs to fit into 10 weeks
- The course will be held Tuesdays and Thursdays 10:00-11:30am
- There will be two 90 minute periods in the first week
- students will learn
- recognition skills
- what tools they need to solve the problems presented to them

- Problem driven / data driven
- 3 basic math problems - math techniques to be covered to recognize
- differential equations (dynamic data)
- statistical / probability
- Fourier analysis

- It will be taught predominantly by Dave Wonnacott with Pre-CAtS faculty offering their expertise as guest lecturers.
- This course should be advertised to Bryn Mawr students since there are many Math and Physics majors there.
- Freshmen will not be offered the course
- Written work (programming) and oral presentations will be required of the students
- students will be required to do independent projects and they will be paired up (optimally from different major backgrounds)
- there will be three 1/2 hour presentations.

- Some key points that need to be resolved:
- What modules should be presented when?
- A possibility is to have different focused modules each year depending on who is teaching the course. This would attract different core students.
- However, this defeats the original goal of getting all students together from different majors.
- What are the priorities (which topics must be covered)?
- What is the correct mix of:
- math
- algorithms
- natural science

- Recruiting Chemistry & Biology majors will be challenging
- What is the most fundamental core of each subject that should be covered? Then the course should go into more depth. The goal is to look for a good balance of fundamental/depth.
- During the first week have guest lecturers discuss areas of expertise to lure students
- the last 6 weeks of the course could focus on a dual subject matter (e.g., biochem, biophysics, etc.)

The Computer Science department will offer the course every 2 or 4 years; it does not have the faculty to offer it more often than that. The Physics Department and the Math Department will definitely teach the course, which increases how often the course is offered.

How about having a "follow-up" CAtS course from each department that would go into more depth?

As a reminder it was decided in the April 5, 2001 meeting that there will be 3 modules:

- Module 0
- types of topics to be covered
- introductory document that lists helpful background reading material/a short bibliography of reference materials
- subset of modules

- Module 1
- go through actual modules
- visit each module/topic once

- Module 2
- last four weeks of the course will be lab work

- Attended by:
- Julio dePaula, Chemistry
- John Dougherty, Computer Science
- Jerry Gollub, Physics
- Curtis Greene, Math
- Kate Heston, Biology
- Suzanne Amador Kane, Physics
- Rob Manning, Math
- Philip Meneely, Biology
- Kim Minor, Program Coordinator
- Jenni Punt, Biology
- Lyle Roelofs, Physics
- David Wonnacott, Computer Science