A Statistical Machine Learning Perspective of Deep Learning: Algorithm, Theory, Scalable Computing
Maruan Al-Shedivat, Zhiting Hu, Hao Zhang, and Eric Xing
Petuum Inc & Carnegie Mellon University

Elements of AI/Machine Learning
[Figure: a layered stack of ML elements.
- Task
- Model: Graphical Models, Regularized Bayesian Methods, Deep Learning, Sparse Coding, Sparse Structured I/O Regression, Large-Margin, Spectral/Matrix Methods, Nonparametric Bayesian Models
- Algorithm: Stochastic Gradient Descent / Backpropagation, Coordinate Descent, L-BFGS, Gibbs Sampling, Metropolis-Hastings
- Implementation: Mahout (MapReduce), MLlib (BSP), CNTK, MXNet, TensorFlow (Async)
- System: Hadoop, Spark, MPI, RPC, GraphLab
- Platform and Hardware: network switches, Infiniband, network-attached storage, flash storage, server machines, desktops/laptops, NUMA machines, mobile devices, GPUs, CPUs, FPGAs, TPUs, ARM-powered devices, RAM, flash, SSD, cloud compute (e.g., Amazon EC2), IoT networks, data centers, virtual machines]

ML vs DL
[Figure slide comparing machine learning and deep learning.]

Plan
- Statistical and Algorithmic Foundation and Insight of Deep Learning
- On Unified Framework of Deep Generative Models
- Computational Mechanisms: Distributed Deep Learning Architectures

Part I: Basics

Outline
- Probabilistic Graphical Models: Basics
- An overview of DL components
  - Historical remarks: early days of neural networks
  - Modern building blocks: units, layers, activation functions, loss functions, etc.
  - Reverse-mode automatic differentiation (aka backpropagation)
- Similarities and differences between GMs and NNs
  - Graphical models vs. computational graphs
  - Sigmoid Belief Networks as graphical models
  - Deep Belief Networks and Boltzmann Machines
- Combining DL methods and GMs
  - Using outputs of NNs as inputs to GMs
  - GMs with potential functions represented by NNs
  - NNs with structured outputs
- Bayesian Learning of NNs
  - Bayesian learning of NN parameters
  - Deep kernel learning

Fundamental questions of probabilistic modeling
- Representation: what is the joint probability distribution on multiple variables, $P(X_1, X_2, \ldots, X_n)$?
  - How many state configurations are there? Do they all need to be represented?
  - Can we incorporate any domain-specific insights into the representation?
- Learning: where do we get the probabilities from?
  - Maximum likelihood estimation? How much data do we need?
  - Are there any other established principles?
- Inference: if not all variables are observable, how do we compute the conditional distribution of latent variables given evidence?
  - Naively, computing $P(H \mid E)$ requires summing over the $2^k$ configurations of the $k$ unobserved (binary) variables.

What is a graphical model?
[Figure: a possible world of cellular signal transduction.]

GM: structure simplifies representation
[Figure: the same cellular signal transduction network, with its dependency structure made explicit.]
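As a concrete instance of how structure simplifies representation, the worked example below compares a full joint table against a factorized one. The eight-node binary network and its factorization are my illustration, not read off the slide's figure:

```latex
\documentclass{article}
\begin{document}
% A full joint table over 8 binary variables needs $2^8 - 1 = 255$ free
% parameters. Suppose (hypothetically) the graph asserts the factorization
\[
P(X_1,\ldots,X_8)
  = P(X_1)\,P(X_2)\,P(X_3 \mid X_1)\,P(X_4 \mid X_2)\,P(X_5 \mid X_2)\,
    P(X_6 \mid X_3, X_4)\,P(X_7 \mid X_6)\,P(X_8 \mid X_5, X_6).
\]
% Each conditional table for a binary node with $m$ binary parents has
% $2^m$ free parameters, so the total count drops to
\[
\underbrace{1 + 1}_{\text{roots}}
  + \underbrace{2 + 2 + 2 + 2}_{\text{one parent}}
  + \underbrace{4 + 4}_{\text{two parents}} = 18 \ll 255.
\]
\end{document}
```

The saving grows exponentially with the number of variables, as long as each node keeps a bounded number of parents.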
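For the "Learning" question, the maximum-likelihood answer in a fully observed discrete model is simple counting. The formula below is a standard fact stated for the hypothetical network above, not text from the slides:

```latex
\[
\hat{P}(X_6 = 1 \mid X_3 = x_3,\, X_4 = x_4)
  = \frac{\#\{\text{samples with } X_6 = 1,\ X_3 = x_3,\ X_4 = x_4\}}
         {\#\{\text{samples with } X_3 = x_3,\ X_4 = x_4\}}
\]
```

Each parent configuration needs its own counts, which is why "how much data do we need?" scales with the size of the conditional tables.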
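The exponential cost quoted under "Inference" can also be made concrete in code. Below is a minimal sketch of inference by brute-force enumeration over binary variables; the `joint` chain model, its probability values, and the `conditional` helper are all hypothetical, invented for this illustration:

```python
import itertools

def joint(x1, x2, x3):
    """Hypothetical chain-structured joint over three binary variables:
    P(X1) * P(X2 | X1) * P(X3 | X2). All probability values are made up."""
    p1 = 0.6 if x1 else 0.4
    p2 = (0.7 if x2 else 0.3) if x1 else (0.2 if x2 else 0.8)
    p3 = (0.9 if x3 else 0.1) if x2 else (0.5 if x3 else 0.5)
    return p1 * p2 * p3

def conditional(joint_fn, n_vars, query, evidence):
    """P(query | evidence) by brute-force enumeration.

    `query` and `evidence` map 0-based variable indices to 0/1 values.
    Each marginal sums the joint over all 2^k settings of the k free
    variables -- exactly the exponential blow-up noted on the slide.
    """
    def marginal(fixed):
        free = [i for i in range(n_vars) if i not in fixed]
        total = 0.0
        for bits in itertools.product((0, 1), repeat=len(free)):  # 2^k terms
            assign = {**fixed, **dict(zip(free, bits))}
            total += joint_fn(*(assign[i] for i in range(n_vars)))
        return total

    return marginal({**query, **evidence}) / marginal(evidence)

# Example: P(X3 = 1 | X1 = 1). Here only X2 is hidden (k = 1), but with
# n hidden binary variables the inner loop runs 2^n times.
print(conditional(joint, 3, query={2: 1}, evidence={0: 1}))
```

Exploiting the factorization (e.g., via variable elimination or belief propagation) is what lets graphical models avoid this enumeration.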